GenAI Autoencoder

Module 5 covers generative AI models, focusing on their ability to learn patterns from training data to generate new content. It details autoencoders, including their architecture, types, and applications such as dimensionality reduction, feature extraction, image denoising, and compression. The module also discusses various types of autoencoders, including vanilla, denoising, and stacked autoencoders.

Uploaded by

febinsunny2004
Copyright
© All Rights Reserved

Module 5

Generative AI Models

Contents
➢ Introduction to Generative Models: Overview of generative models,
Types of generative models (e.g., GANs, VAEs)
➢ Autoencoders: Basics of autoencoders, Variational Autoencoders (VAEs)
➢ Generative Adversarial Networks (GANs): Introduction to GAN
architecture, Training GANs, Applications of GANs
Introduction to Generative Models

➢ Generative models are a class of artificial intelligence (AI) models that learn the
underlying patterns, structure, and probability distribution of training data in order to
generate new, original samples (such as images, text, audio, or 3D models) that
resemble the training data.
➢ A generative model is a machine learning model designed to create new data that
is similar to its training data. Generative artificial intelligence (AI) models learn
the patterns and distributions of the training data, then apply those
understandings to generate novel content in response to new input data.
- IBM
➢ Major types include Variational Autoencoders (VAEs), Generative Adversarial
Networks (GANs), Diffusion Models, and Transformers.
Autoencoders
● Autoencoders are a class of unsupervised neural network (they don't need
explicit labels to train on) in which the target output is the same as the input.
● The aim of an autoencoder is to learn a lower-dimensional representation
(encoding) of higher-dimensional data.
● They compress the input into a lower-dimensional code and then
reconstruct the output from this representation.
The architecture of autoencoders:

Autoencoders consist of 3 parts:

1. Encoder: A module that compresses the input data (drawn from the training,
validation, or test set) into an encoded representation that is typically much
smaller than the input. It produces the code.

2. Bottleneck: A module that contains the compressed knowledge
representations and is therefore the most important part of the network.

3. Decoder: A module that helps the network "decompress" the knowledge
representations and reconstructs the data from its encoded form. The
output is then compared with the ground truth, which for an autoencoder is
the original input.
The relationship between the Encoder, Bottleneck, and Decoder
Encoder

In a convolutional autoencoder, the encoder is a set of convolutional blocks followed by
pooling modules that compress the model's input into a compact section called the bottleneck.

Bottleneck
● The most important part of the neural network, and ironically the smallest one, is
the bottleneck.
● The bottleneck is designed so that it captures as much of the information in an
image as possible; we can say that the bottleneck helps us form a
knowledge representation of the input.
● Because the bottleneck is a compressed representation of the input, it also
discourages the neural network from memorising the input and overfitting the data.
Decoder
The decoder is a set of upsampling and convolutional blocks that reconstructs the
input from the bottleneck's output.

The number of hidden units in the autoencoder is typically less than the number of input (and
output) units. This forces the encoder to learn a compressed representation of the input, which
the decoder reconstructs.
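
The encoder, bottleneck, and decoder described above can be sketched as a single forward pass. This is a minimal illustration with assumptions that are not in the slides: it uses NumPy, dense (fully connected) layers instead of convolutional blocks, hypothetical sizes (8 inputs, a 3-unit bottleneck), and random untrained weights. It only shows how the shapes shrink and then expand.

```python
import numpy as np

rng = np.random.default_rng(0)

n_input, n_code = 8, 3  # hypothetical sizes: 8 input features, 3-unit bottleneck

# Randomly initialised (untrained) weights; all names are illustrative only.
W_enc = rng.normal(scale=0.1, size=(n_input, n_code))
b_enc = np.zeros(n_code)
W_dec = rng.normal(scale=0.1, size=(n_code, n_input))
b_dec = np.zeros(n_input)


def encode(x):
    """Encoder: compress the input into the bottleneck code."""
    return np.tanh(x @ W_enc + b_enc)


def decode(code):
    """Decoder: map the code back to the input dimension."""
    return code @ W_dec + b_dec


x = rng.normal(size=(5, n_input))   # a batch of 5 samples
code = encode(x)                    # bottleneck representation
reconstruction = decode(code)       # same shape as the input

print(x.shape, code.shape, reconstruction.shape)  # (5, 8) (5, 3) (5, 8)
```

Because the code has fewer units than the input, the network cannot simply copy its input through; it has to summarise it.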
• If there is a structure in the input data in the form of correlations between input
features, then the autoencoder will discover some of these correlations, and end up
learning a low-dimensional representation of the data similar to that learned using
principal component analysis (PCA).

• Once the autoencoder is trained, we would typically just discard the decoder
component and use the encoder component to generate compact representations of
the input.

• Alternatively, we could use the encoder as a feature detector that generates a
compact, semantically rich representation of our input and build a classifier by
attaching a softmax classifier to the hidden layer.
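
The comparison with PCA in the first bullet above can be made concrete by writing PCA itself as a linear "encoder" and "decoder". The sketch below is an illustration under assumptions, not part of the slides: it uses NumPy and synthetic data in which 6 correlated features are generated from 2 hidden factors, so a 2-dimensional code is enough to reconstruct the data almost exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 6 correlated features driven by 2 hidden factors.
factors = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = factors @ mixing

mean = X.mean(axis=0)

# Principal directions are the right singular vectors of the centred data.
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]                          # top-2 directions, shape (2, 6)

code = (X - mean) @ components.T             # "encoder": 6 features -> 2 numbers
reconstruction = code @ components + mean    # "decoder": 2 numbers -> 6 features

mse = np.mean((X - reconstruction) ** 2)
print(code.shape, mse)  # the error is close to zero for this rank-2 data
```

An autoencoder with a linear encoder and decoder trained on squared error is searching for a low-dimensional subspace of this kind; non-linear activations let it go beyond what PCA can represent.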
Applications of Autoencoder
1. Dimensionality Reduction
An autoencoder is trained to capture the natural structure in the
data in an efficient lower-dimensional representation. It does this by
encoding and then decoding the input so as to minimize the reconstruction
error.

Example: the input and the output each have 3000 dimensions, and the desired
reduced dimension is 200.
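
To get a feel for the numbers in this example, the short calculation below works out the reduction factor and the number of weights involved. It assumes, purely for illustration, a single fully connected layer with biases on each side of the 200-dimensional code; the slides do not specify the layer structure.

```python
n_input = 3000  # input (and output) dimension from the example
n_code = 200    # reduced dimension from the example

compression_factor = n_input / n_code

# Assumption: one fully connected layer with biases on each side of the code.
encoder_params = n_input * n_code + n_code    # weights + biases
decoder_params = n_code * n_input + n_input   # weights + biases

print(compression_factor)              # 15.0
print(encoder_params, decoder_params)  # 600200 603000
```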
2. Feature Extraction
● Autoencoders can be used as a feature extractor for classification or
regression tasks.
● Autoencoders take unlabeled data and learn efficient codings about
the structure of the data that can be used for supervised learning
tasks.
● After training an autoencoder network using a sample of training
data, we can ignore the decoder part of the autoencoder, and only use
the encoder to convert raw input data of higher dimension to a lower
dimension encoded space.
● This lower-dimensional representation can then be used as the feature set for
supervised tasks.
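
The idea of keeping only the encoder and feeding its output to a supervised model can be sketched as follows. This is a hedged illustration: the encoder weights here are random stand-ins for weights that would come from a trained autoencoder, the sizes (20 inputs, 4 features, 3 classes) are hypothetical, and the softmax head is untrained.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_code, n_classes = 20, 4, 3  # hypothetical sizes

# Stand-in for encoder weights learned during autoencoder training.
W_enc = rng.normal(scale=0.1, size=(n_input, n_code))
b_enc = np.zeros(n_code)

# Classifier head attached to the code (untrained in this sketch).
W_clf = rng.normal(scale=0.1, size=(n_code, n_classes))
b_clf = np.zeros(n_classes)


def encode(x):
    """Encoder only: the decoder has been discarded."""
    return np.tanh(x @ W_enc + b_enc)


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=1, keepdims=True)
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=1, keepdims=True)


def predict_proba(x):
    """Raw input -> low-dimensional features -> class probabilities."""
    features = encode(x)
    return softmax(features @ W_clf + b_clf)


x = rng.normal(size=(5, n_input))
probs = predict_proba(x)
print(probs.shape)  # (5, 3)
```

In practice the classifier head would be trained on labelled data, with the encoder weights either frozen or fine-tuned.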
3. Image Denoising

● Real-world raw input data is often noisy, and training a robust supervised
model requires clean, noise-free data. Autoencoders can be used to denoise
the data.

● Image denoising is one of the popular applications, in which the
autoencoder tries to reconstruct the noiseless image from a noisy input
image.
4. Image Compression:

● Image compression is another application of an autoencoder network.
● The raw input image is passed to the encoder network to obtain a
compressed, lower-dimensional encoding.
● The autoencoder network weights are learned by reconstructing the
image from the compressed encoding using a decoder network.
• We can think of autoencoders as consisting of two cascaded networks.

• The first network is an encoder: it takes the input x and encodes it using a
transformation h to an encoded signal y, that is:

y = h(x)

• The second network uses the encoded signal y as its input and performs
another transformation f to get a reconstructed signal r, that is:

r = f(y) = f(h(x))

• We define the error, e, as the difference between the original input x and the
reconstructed signal r: e = x - r.

• The network then learns by reducing the loss function (for example mean
squared error (MSE)), and the error is propagated backwards to the hidden
layers as in the case of MLPs.
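
The steps above (encode, decode, compute the error, backpropagate) can be written out as a small training loop. The sketch below is one possible instance under stated assumptions, not a reference implementation: it uses NumPy, a tanh encoder for h, a linear decoder for f, full-batch gradient descent, synthetic standardised data, and hand-picked values for the learning rate and number of epochs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 6 correlated features driven by 2 hidden factors,
# standardised so every feature has zero mean and unit variance.
factors = rng.normal(size=(200, 2))
X = factors @ rng.normal(size=(2, 6))
X = (X - X.mean(axis=0)) / X.std(axis=0)

n_input, n_code = X.shape[1], 2
W1 = rng.normal(scale=0.1, size=(n_input, n_code))  # encoder weights
b1 = np.zeros(n_code)
W2 = rng.normal(scale=0.1, size=(n_code, n_input))  # decoder weights
b2 = np.zeros(n_input)

learning_rate = 0.02  # hand-picked for this toy problem
losses = []

for epoch in range(3000):
    # Forward pass
    y = np.tanh(X @ W1 + b1)         # y = h(x)
    r = y @ W2 + b2                  # r = f(y)
    e = X - r                        # e = x - r
    losses.append(np.mean(e ** 2))   # mean squared error

    # Backward pass: propagate the error back through decoder and encoder
    grad_r = -2.0 * e / e.size
    grad_W2 = y.T @ grad_r
    grad_b2 = grad_r.sum(axis=0)
    grad_y = grad_r @ W2.T
    grad_z = grad_y * (1.0 - y ** 2)  # derivative of tanh
    grad_W1 = X.T @ grad_z
    grad_b1 = grad_z.sum(axis=0)

    # Gradient descent update
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

print(losses[0], losses[-1])  # the loss after training is lower than at the start
```

Deep learning frameworks compute these gradients automatically; they are spelled out here only to show how the error flows backwards.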
● Depending upon the actual dimensions of the encoded layer with respect to
the input, the loss function, and constraints, there are various types of
autoencoders:

■ Vanilla autoencoders
■ Denoising autoencoders
■ Stacked autoencoders
■ Sparse autoencoders
■ Variational autoencoders
Vanilla autoencoders
• The vanilla autoencoder is the simplest form and consists of one hidden
layer only. Autoencoders for dimensionality reduction were popularised by
Hinton and Salakhutdinov in their 2006 paper Reducing the Dimensionality of
Data with Neural Networks.

• The number of neurons in the hidden layer is less than the number of
neurons in the input (or output) layer.

• This results in a bottleneck effect in the flow of information in the
network. The hidden layer in between is also called the "bottleneck layer."

• Learning in the autoencoder consists of developing a compact representation
of the input signal at the hidden layer so that the output layer can faithfully
reproduce the original input.
Denoising autoencoders
• A denoising autoencoder learns from a corrupted (noisy) input: it feeds its
encoder network the noisy input, and the reconstructed image from the
decoder is then compared with the original (clean) input.
• The idea is that this will help the network learn how to denoise an input.
• It will no longer just make pixel-wise comparisons, but in order to denoise it
will learn the information of neighboring pixels as well.
• The corruption process typically follows one of two approaches.
• Approach 1: randomly setting some of the inputs to zero or one
○ We can randomly set some of the inputs (as many as half of them) to
zero or one; most commonly, random values are set to zero to imply
missing values. This can be done by manually imputing zeros or ones
into the inputs or by adding a dropout layer between the inputs and the
first hidden layer.
• Approach 2: adding pure Gaussian noise to the inputs
• Training a denoising autoencoder is nearly the same process as training
a regular autoencoder. The only difference is that we supply our corrupted
inputs to training_frame and supply the non-corrupted inputs
to validation_frame.
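
The two corruption approaches can be sketched as small helper functions. This is an illustration under assumptions: it uses NumPy, the function names and default noise levels are hypothetical, and no network is trained. The final lines show why the clean input is used as the target: a model that simply copies its corrupted input no longer achieves zero loss.

```python
import numpy as np

rng = np.random.default_rng(0)


def mask_inputs(x, drop_fraction=0.5):
    """Approach 1: set a random subset of the inputs to zero."""
    keep = rng.random(x.shape) >= drop_fraction
    return x * keep


def add_gaussian_noise(x, std=0.1):
    """Approach 2: add zero-mean Gaussian noise to every input."""
    return x + rng.normal(scale=std, size=x.shape)


def denoising_loss(reconstruction, clean):
    """MSE between the reconstruction and the clean (uncorrupted) input."""
    return np.mean((clean - reconstruction) ** 2)


x_clean = rng.random((4, 10)) + 0.5   # stand-in for clean inputs
x_masked = mask_inputs(x_clean)       # the network would see this...
x_noisy = add_gaussian_noise(x_clean)  # ...or this

# A "copy the input" model is penalised, because the target is the clean data.
print(denoising_loss(x_noisy, x_clean) > 0)   # True
print(denoising_loss(x_clean, x_clean) == 0)  # True
```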
Results [figure not reproduced]
Stacked autoencoder
• Until now we have restricted ourselves to autoencoders with only one hidden
layer.
• We can build Deep autoencoders by stacking many layers of both
encoder and decoder; such an autoencoder is called a Stacked autoencoder.
• The stacked autoencoder can be trained as a whole network with an aim to
minimize the reconstruction error.
● Thus stacked autoencoders are simply deep autoencoders with multiple hidden
layers. With more hidden layers, the autoencoder can learn more complex codings.
● When the deep autoencoder network is a convolutional network, we call it a
convolutional autoencoder.
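
Stacking can be sketched by building the encoder and decoder from lists of layers. The example below is a minimal illustration under assumptions: it uses NumPy, dense layers, hypothetical widths (64, 32, 16, 8 and back), and random untrained weights, so it only shows how data flows through several hidden layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer widths: 64 -> 32 -> 16 -> 8 (code) -> 16 -> 32 -> 64.
encoder_sizes = [64, 32, 16, 8]
decoder_sizes = encoder_sizes[::-1]


def make_layers(sizes):
    """Create one (weights, bias) pair per consecutive pair of widths."""
    return [
        (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(h, layers, activate_last=True):
    """Pass h through each layer in turn, applying tanh between layers."""
    for i, (W, b) in enumerate(layers):
        h = h @ W + b
        is_last = i == len(layers) - 1
        if activate_last or not is_last:
            h = np.tanh(h)
    return h


encoder_layers = make_layers(encoder_sizes)
decoder_layers = make_layers(decoder_sizes)

x = rng.normal(size=(5, 64))
code = forward(x, encoder_layers)
reconstruction = forward(code, decoder_layers, activate_last=False)

print(x.shape, code.shape, reconstruction.shape)  # (5, 64) (5, 8) (5, 64)
```

The whole stack would then be trained end to end to minimize the reconstruction error, as described above.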

Convolutional autoencoder for removing noise from images