How does Generative AI work?

About the author

Nitor Infotech Blog

Nitor Infotech is a leading software product development company serving ISVs, enterprises, and private equity firms globally.

Artificial intelligence | 12 Jun 2024 | 13 min |

If you’re a software developer, marketer, or business owner, you’ve likely heard a lot about what Generative Artificial Intelligence (GenAI) can do. If not, I suggest first spending about 12 minutes getting a complete understanding of what GenAI is.

If you’re short on time, here’s a brief definition of GenAI:

Generative AI, or GenAI, is the subset of artificial intelligence that can generate human-like text, images, videos, sounds, 3D models, and more after learning from vast sets of data. It uses advanced algorithms and neural networks to mimic human creativity and generate new content.

This image was generated with “Bing AI”, a tool built on GenAI technology:

Prompt used: A 3D software developer working with robots

Much like in the example above, with GenAI in your organizational toolkit, you can:

streamline your development process
enhance your creative roadmap
make informed business decisions

Clear on the basics? Great!

Now, let’s delve into the fascinating process behind this generative magic.

How does GenAI work?

I aim to simplify the mechanism of GenAI. Here are three easy-to-follow steps that describe what GenAI does:

Recognizes Patterns and Structures
Works with Neural Networks
Gets Trained on Diverse Data

Let’s break down the above mechanism!

1. Understanding Patterns and Structures:

At its core, GenAI uses advanced math and huge amounts of data to recognize patterns in various types of human-created content. This understanding lets it create new content that looks and sounds like the examples it has learned from.

2. Working with Neural Networks:

GenAI gets its power from models built on neural networks. These web-like structures of interconnected nodes can identify the key features of the data they are trained on and reproduce them in new content.

What actually happens?

Well, the neural networks are fed enormous volumes of data, ranging from written text to visual imagery and audio recordings, which they analyze and internalize. Through this process of unsupervised learning, the models develop an understanding of the underlying grammar, syntax, style, and semantics that define different types of content.

For example, GPT-3, a language model by OpenAI with 175 billion parameters, is trained on a vast amount of text from books, articles, and web pages. It learns language patterns and relationships, encoding this understanding in its neural network. This allows it to generate new, coherent text that resembles the data it was trained on.
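
To make the idea of “learning language patterns” concrete, here is a minimal sketch in Python. It is not how GPT-style models are actually built (they use neural networks with billions of parameters); it is a toy bigram model that simply counts which word tends to follow which. The function name `train_bigram_model` and the sample corpus are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count how often each word follows another, then turn counts into probabilities."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    # Walk through every pair of neighbouring words in the text.
    for current_word, next_word in zip(words, words[1:]):
        counts[current_word][next_word] += 1

    model = {}
    for word, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts so the probabilities for each word sum to 1.
        model[word] = {nxt: n / total for nxt, n in followers.items()}
    return model


# Tiny made-up corpus, used only for illustration.
corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(corpus)
print(model["the"])
```

In this toy corpus, “the” is followed by “cat” in half of the cases, so the model assigns that continuation the highest probability. Real models learn far richer patterns, but the principle of estimating what is likely to come next from observed data is similar.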

3. Training on Diverse Data:

Once the models have been sufficiently trained, they can be used to generate new, original content. This is accomplished through a process known as “conditional generation”, in which the model is guided by the user’s prompt; crafting effective prompts is known as prompt engineering. The model then uses its learned knowledge to produce content that is often surprisingly human-like.
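
As a rough illustration of conditional generation, the sketch below continues a prompt one word at a time by sampling from next-word probabilities. Everything in it is an assumption for demonstration purposes: the `NEXT_WORD_SCORES` table is hand-written (a trained model would compute such scores itself), and `generate` and `temperature` are hypothetical names, not the API of any real product.

```python
import math
import random

# Hypothetical next-word scores; a trained model would compute these itself.
NEXT_WORD_SCORES = {
    "a": {"man": 2.0, "ship": 1.0, "storm": 0.5},
    "man": {"sets": 2.5, "sails": 1.0},
    "sets": {"sail": 3.0},
    "sail": {"toward": 2.0, "alone": 1.0},
    "toward": {"an": 2.0},
    "an": {"island": 2.5, "unknown": 2.0},
    "unknown": {"island": 3.0},
}


def softmax(scores, temperature=1.0):
    """Convert raw scores into probabilities; lower temperature sharpens them."""
    scaled = {word: score / temperature for word, score in scores.items()}
    highest = max(scaled.values())
    exps = {word: math.exp(s - highest) for word, s in scaled.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


def generate(prompt, max_new_words=8, temperature=1.0, seed=0):
    """Extend the prompt word by word, conditioning each step on the last word."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    words = prompt.lower().split()
    for _ in range(max_new_words):
        scores = NEXT_WORD_SCORES.get(words[-1])
        if scores is None:  # no known continuation, so stop generating
            break
        probs = softmax(scores, temperature)
        next_word = rng.choices(list(probs), weights=list(probs.values()))[0]
        words.append(next_word)
    return " ".join(words)


print(generate("a man", temperature=0.7))
```

The prompt acts as the condition: changing it changes which continuations are likely. The temperature setting trades predictability for variety, which is one reason the same prompt can yield different outputs.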

Note: Every model may function differently depending on the type of mechanism and the data it has been trained on.

For example, when prompted to “write a 50-word story about a man who sets sail toward an unknown island,” a text-based AI model will draw from its extensive knowledge to craft a story that accurately fulfills the task while adding its own creative touch.

Compare the outputs generated by ChatGPT 3.5 and Microsoft Copilot:

Fig: Output by ChatGPT 3.5

Fig: Output by Microsoft Copilot

Similarly, image-based AI models can turn text into realistic images. They associate visual elements, colors, and patterns with words to create new images matching the descriptions/prompts.

Here are some images generated by Adobe Firefly:

Prompt used: A child playing with robotic toys

Note: Large Language Models (LLMs) generate responses based on learned patterns, like autocomplete but far more advanced. Sometimes, they produce incorrect but plausible-sounding responses, a phenomenon known as “hallucination.”

For example, a traditional logic system would solve 12-5 as 7. In comparison, an LLM might incorrectly link the sequence to Pi and give a response like “12-5=7 ->. -> 1 -> 4,” instead of the correct answer.

Learn to train your LLMs and steer away from bias and hallucinations.

Coming back to the idea that “different models work differently,” let’s take a leap further into the unknown.

How does each GenAI Model work?

There are four primary types of GenAI models:

Diffusion Models
Variational Autoencoders (VAEs)
Generative Adversarial Networks (GANs)
Transformer Models

Here’s a preview of how each model functions:

Diffusion models: The two major steps in diffusion models are forward diffusion and reverse diffusion. The forward diffusion process gradually adds random noise to the training data, while the reverse process learns to remove that noise and reconstruct the data samples. Novel data can then be generated by running the reverse denoising process starting from entirely random noise.
Variational autoencoders (VAEs): VAEs consist of two neural networks known as the encoder and the decoder. When given an input, the encoder converts it into a smaller, denser representation of the data. This compressed representation preserves the information the decoder needs to reconstruct the original input while discarding irrelevant information. The encoder and decoder work together to learn an efficient and simple latent data representation.
Generative adversarial networks (GANs): GANs pit two neural networks against each other: a generator that creates new examples and a discriminator that learns to classify content as either real (from the domain) or fake (generated). The two models are trained together: the generator produces better content and the discriminator gets better at spotting generated content. This procedure repeats, pushing both to improve with every iteration, ideally until the generated content is indistinguishable from the existing content.
Transformer models: At the core of transformer models is the self-attention mechanism, which allows the model to dynamically weigh the importance of different parts of the input when generating new content. This contrasts with the sequential, recurrent nature of RNNs (Recurrent Neural Networks), where information moves linearly through the network.
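
To show what “weighing the importance of different parts of the input” can look like, here is a simplified sketch of scaled dot-product self-attention in plain Python. It makes a big simplifying assumption: the input vectors are used directly as queries, keys, and values, whereas real transformers first pass them through learned projection matrices. The function names and the three toy token vectors are invented for this example.

```python
import math


def softmax(row):
    """Turn a list of scores into weights that sum to 1."""
    highest = max(row)
    exps = [math.exp(x - highest) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Simplified self-attention: queries, keys, and values are the inputs themselves.

    Real transformers apply learned projections first; that step is omitted here.
    """
    dim = len(vectors[0])
    outputs, weights = [], []
    for query in vectors:
        # Similarity of this token to every token, scaled by sqrt(dimension).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        attn = softmax(scores)
        weights.append(attn)
        # Each output is a weighted mix of all input vectors.
        outputs.append(
            [sum(w * v[i] for w, v in zip(attn, vectors)) for i in range(dim)]
        )
    return outputs, weights


# Three toy "token" vectors; the first two are deliberately similar.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens)
for row in weights:
    print([round(w, 2) for w in row])
```

Each printed row shows how much one token attends to every token in the input. Because every token is compared with all the others at once, no information has to travel step by step through a sequence, as it does in an RNN.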

By now, you should understand the mechanism behind GenAI and its different models. However, if you’re skeptical about the effectiveness of GenAI models, the factors up next can provide some clarity.

Evaluating GenAI Models

Evaluating Generative AI (GenAI) models involves:

Defining Objectives and Metrics: Identify goals and choose metrics such as perplexity (for language models), BLEU (Bilingual Evaluation Understudy, for translation and other generated text), and FID (Fréchet Inception Distance, for generated images).
Quantitative Evaluation: Apply statistical measures to evaluate performance accurately.
Qualitative Evaluation: Conduct human assessments through evaluations and user studies to gather insightful feedback.
Task-Specific Evaluations: Customize evaluations for specific tasks like text completion and image captioning to ensure relevance.
Consistency and Diversity: Verify that outputs are both diverse and consistent to maintain quality and reliability.
Ethical and Bias Evaluation: Identify and address biases to ensure ethical and fair model behavior.
Continuous Monitoring and Improvement: Regularly update and enhance the model based on ongoing feedback and performance reviews.
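
As one example of a quantitative metric from the list above, the sketch below computes perplexity, which measures how “surprised” a language model is by a piece of text (lower is better). It assumes you already have the probability the model assigned to each actual token; the two probability lists here are made-up values for illustration, not results from any real model.

```python
import math


def perplexity(token_probabilities):
    """Perplexity = exp of the average negative log-probability per token."""
    n = len(token_probabilities)
    avg_neg_log = -sum(math.log(p) for p in token_probabilities) / n
    return math.exp(avg_neg_log)


# Made-up probabilities a model might assign to the correct next tokens.
confident_model = [0.9, 0.8, 0.95, 0.85]
uncertain_model = [0.2, 0.1, 0.3, 0.25]

print(perplexity(confident_model))
print(perplexity(uncertain_model))
```

A model that guesses uniformly among four options has a perplexity of 4, so the value can be read loosely as the number of choices the model is effectively torn between at each step.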

The success of generative AI is largely due to the rapid advancements in computational power, data storage, and machine learning algorithms in recent years. As these technologies continue to evolve, the capabilities of generative AI systems are expected to keep growing, enabling them to tackle increasingly complex and creative tasks.

If you’re looking to leverage the potential of GenAI technology, reach out to us at Nitor Infotech. We can help you build game-changing GenAI-powered software products while implementing industry best practices.