Vidisha Chirmulay
Senior MarCom Executive

If you’ve been looking to read about a highly powerful generative tool in the machine learning realm, you have arrived at the right place.

Allow me to begin with a confession.

As I sat down to write this blog and started hunting for everything I needed to know about variational autoencoders (VAEs), one word used to describe them jumped out and caught my attention: ‘beautiful’.

I was slightly surprised at the choice of adjective, but I grew more and more convinced about it as I continued my search.

As we set off on our exploration of VAEs in today’s blog, let’s first understand what autoencoders are.

What are autoencoders?

An autoencoder is an unsupervised ML algorithm whose purpose is to learn a low(er)-dimensional representation of your input. This idea is quite deep if you muse over it.

First, let’s understand what a representation is: it is how you choose to describe something, in a way that works well enough for your purposes.

An autoencoder:

  • takes an image,
  • passes it through the convolutional encoder,
  • creates a latent representation of the image, and
  • creates the output image via the deconvolutional decoder (a minimal sketch follows below).

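To make those four steps concrete, here’s a minimal sketch of such an autoencoder in PyTorch. The layer sizes and the 28x28 single-channel input are illustrative assumptions, not anything prescribed:

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional encoder: compresses a 1x28x28 image into a compact code.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1),   # -> 16x14x14
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),  # -> 32x7x7
            nn.ReLU(),
        )
        # Deconvolutional decoder: rebuilds the image from that code.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=3, stride=2,
                               padding=1, output_padding=1),        # -> 16x14x14
            nn.ReLU(),
            nn.ConvTranspose2d(16, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),        # -> 1x28x28
            nn.Sigmoid(),
        )

    def forward(self, x):
        latent = self.encoder(x)      # the latent representation of the image
        return self.decoder(latent)   # the reconstructed output image

x = torch.rand(1, 1, 28, 28)          # a stand-in "image"
print(ConvAutoencoder()(x).shape)     # torch.Size([1, 1, 28, 28])
```
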
Now, what is a VAE all about? Let’s take a look.

Why Variational Autoencoders (VAEs)?

Variational autoencoders (VAEs) are generative models that offer a probabilistic way of representing an observation in latent space.

They are designed to capture the underlying probability distribution of a given dataset and generate new samples from it.

While using generative models, you might merely wish to create a random, fresh output that looks like the training data, and you can do that with VAEs. But it’s more likely that you’d want to alter or explore variations on data you already have, and in a desired, particular direction rather than randomly. This is where VAEs work particularly well.

As you can imagine, it’s not a plain vanilla concept. So how is it different from vanilla autoencoders, then? Read on…

Difference between a vanilla autoencoder and the variational one

Vanilla autoencoder:

  • The model is trained by minimizing the difference between the original input and the reconstructed output; this basic format is what “vanilla” refers to.
  • Linearly interpolating between two encodings rarely produces good generated samples, because discontinuities in the embedding space prevent smooth transitions.

Variational autoencoder:

  • A VAE learns a distribution (typically a normal one) over the vectors in the latent space. In other words, there are many ways for an image to be encoded into, and decoded from, the latent space.
  • The latent space is continuous by design, which makes random sampling and interpolation smooth.
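
That interpolation difference is worth making concrete. In a VAE, you can blend two latent codes linearly and decode each blend into a plausible output. A toy sketch, where the latent vectors are random stand-ins rather than real encodings:

```python
import numpy as np

# Stand-in latent codes for two images (in practice, the encoder produces these).
z_a = np.random.randn(16)
z_b = np.random.randn(16)

# Blend the two codes in equal steps from pure A to pure B.
alphas = np.linspace(0.0, 1.0, 8)
path = [(1 - a) * z_a + a * z_b for a in alphas]

# In a VAE, decoding each point on this path gives a smooth visual transition;
# in a vanilla autoencoder, intermediate points can land in "holes" of the
# embedding space and decode to meaningless output.
```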

Now that you know the difference between the two, let’s take a closer look at a VAE…

Architecture of the VAE

The VAE architecture comprises two parts — the encoder and decoder.

You use the encoder to map the input to a latent space. Meanwhile, the decoder maps the latent space back to the original input.

Since the decoder is used to reconstruct the input, it is built as a mirror image of the encoder. Together, the encoder and decoder give the VAE its characteristic bowtie shape.


Fig: Architecture of the VAE

Source: Variational Autoencoders Simply Explained | by Ayan Nair | Becoming Human: Artificial Intelligence Magazine
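
To make the bowtie concrete, here’s a minimal VAE skeleton in PyTorch. The layer sizes and the flattened 784-dimensional input (a 28x28 image) are illustrative assumptions; the essential point is that the encoder outputs the parameters of a distribution (a mean and a log-variance) rather than a single point:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=16):
        super().__init__()
        # Encoder: maps the input to the parameters of a distribution over
        # the latent space (a mean and a log-variance per latent dimension).
        self.enc = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        # Decoder: mirrors the encoder, mapping a latent vector back to a
        # reconstruction of the original input.
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.enc(x)
        return self.to_mu(h), self.to_logvar(h)

    def decode(self, z):
        return self.dec(z)
```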

Now that you are familiar with the architecture, let’s look at what happens during the training of a VAE.

What happens during the training of a VAE?

In the world of VAEs, machines are trained to understand the essence of data and then unleash their imaginative powers. In this section, we’ll unravel the secrets behind the scenes and explore what happens during the training of a VAE.


Fig: Training of a VAE

  1. Understanding the Foundation:
    A VAE is a type of artificial neural network designed for unsupervised learning and generative tasks. It consists of an encoder, a decoder, and a latent space. The encoder compresses input data into a meaningful representation in the latent space. The decoder reconstructs the original data from this representation.
  2. The Quest for a Magical Code:
    Imagine your data as a collection of unique stories, each with its own characters, settings, and plot twists. During training, the VAE becomes like an attentive reader. It deciphers the essential elements of these stories and encodes them into a set of numbers. This set of numbers is the latent space, a magical code that captures the essence of the input data.
  3. Balancing Act:
    The VAE aims to strike a delicate balance between two crucial objectives: reconstruction and regularization. The reconstruction objective drives the VAE to faithfully recreate the input data, while the regularization term ensures that the latent space is well-behaved, encouraging the model to learn meaningful and structured representations (see the code sketch after this list).
  4. Sampling Adventures in Latent Space:
    One of the enchanting aspects of VAEs is their ability to generate new and diverse content. This is made possible by sampling from the learned latent space. Picture the latent space as a vast, unexplored realm of possibilities, and the VAE as an intrepid explorer navigating through it. During training, the VAE learns to sample from this space, which enables it to generate novel variations of the input data.
  5. Losses and Lessons:
    Training a VAE involves a dance with loss functions. The reconstruction loss measures how well the generated data matches the input. This pushes the VAE to become a master storyteller. Simultaneously, the regularization term imparts structure and coherence to the latent space. This guides the VAE in creating meaningful variations.
  6. Iterations and Fine-Tuning:
    Like any skilled artist, the VAE may not get it right on the first try. Training involves multiple iterations in which hyperparameters are fine-tuned, architectures are adjusted, and the model learns from its mistakes. It’s a journey of continuous improvement that refines the VAE’s ability to capture the essence of the data.
  7. Unleashing Creativity:
    As training progresses, the VAE transforms from a novice to a creative genius. Armed with the knowledge encoded in the latent space, it becomes capable of generating new and diverse content. Whether it’s generating artwork, creating music, or imagining unique data points, the VAE emerges as a versatile and imaginative entity.
  8. The Grand Finale: Deployment:
    Once the VAE has mastered the art of representation and generation, it is ready for its debut. Deploying the trained model allows it to create new, never-before-seen data, or represent existing data in a meaningful and compact form. This makes it a valuable tool in various machine learning applications.
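
Steps 3 through 5 above boil down to surprisingly little code. Continuing the sketch from the architecture section, a single training step draws a latent sample with the reparameterization trick (so gradients can flow through the sampling step) and minimizes a reconstruction loss plus a KL-divergence regularizer, the standard ELBO objective. The dimensions and hyperparameters here are illustrative:

```python
import torch
import torch.nn.functional as F

def reparameterize(mu, logvar):
    # Sample z = mu + sigma * eps with eps ~ N(0, I), so the sampling step
    # stays differentiable and gradients can reach the encoder.
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term: how faithfully the output matches the input.
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # Regularization term: KL divergence pulling q(z|x) toward N(0, I),
    # which keeps the latent space well-behaved.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# One illustrative training step (dummy data in place of a real dataset),
# assuming the VAE class sketched in the architecture section:
vae = VAE()
optimizer = torch.optim.Adam(vae.parameters(), lr=1e-3)
x = torch.rand(32, 784)                       # a batch of flattened "images"
mu, logvar = vae.encode(x)
x_hat = vae.decode(reparameterize(mu, logvar))
loss = vae_loss(x, x_hat, mu, logvar)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

The two loss terms are exactly the balancing act from step 3: weight the KL term down and you get sharper reconstructions but a messier latent space; weight it up and samples become smoother but blurrier.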

Now, let’s jump into how VAEs function in the real world!

Applications of VAEs

Here are some significant applications of VAEs across industries:


Fig: Applications of VAEs

  • Image generation: In the realm of healthcare, VAEs help make sense of medical images, support disease diagnosis, and reconstruct high-quality images from limited data, all of which enhances medical imaging applications. Beyond that, VAEs prove to be wonderful collaborators for imagery in video games and movies.
  • Text generation: Poems, narratives, chatbots’ speech, translation – you name it, and VAEs have got the text covered.
  • Music generation: Whether it’s coming up with fresh music composition tools, innovating with new music genres and styles, or providing personalized musical recommendations, VAEs play a role in all of these activities.
  • Anomaly detection: VAEs help flag fraudulent financial activity and zero in on manufacturing defects by learning what normal data looks like and scoring how far a new sample deviates from it (see the sketch below).
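
To give a sense of how the anomaly detection use case works: a VAE trained on normal data reconstructs normal samples well and unusual ones poorly, so per-sample reconstruction error can serve as an anomaly score. A minimal sketch reusing the VAE class from earlier; the untrained model and the 3-sigma threshold are illustrative stand-ins:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def anomaly_scores(vae, x):
    # Encode deterministically (take the mean, skip sampling) and reconstruct.
    mu, _ = vae.encode(x)
    x_hat = vae.decode(mu)
    # Per-sample reconstruction error: a high score means the sample looks
    # unlike the "normal" data the VAE was trained on.
    return F.mse_loss(x_hat, x, reduction="none").mean(dim=1)

vae = VAE()                                   # assume this was trained on normal data
x = torch.rand(100, 784)                      # dummy batch of flattened samples
scores = anomaly_scores(vae, x)
threshold = scores.mean() + 3 * scores.std()  # illustrative 3-sigma cutoff
flagged = (scores > threshold).nonzero().flatten()
print(f"{flagged.numel()} potential anomalies flagged")
```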

Time to look at what is on the horizon…

The future of VAEs

In a nutshell, VAEs have the capacity to transform generative AI as we know it. As they continue to be developed and improved, they will enable new applications across various industries and support a great deal of creativity.

The path to mastering VAEs can be difficult, but the rewards are many. Thanks to VAEs, we have a dynamic tool to navigate the generative modeling landscape smoothly.

Well, VAEs are beautiful, there’s no doubt.

Send us an email with your thoughts about the beauty of VAEs. Connect with us at Nitor Infotech to learn what we do as a software company, and in the world of generative AI.


