DL Lecture 18 Autoencoders

The document discusses autoencoders, a type of neural network architecture used for dimensionality reduction and data compression by encoding input data into a lower-dimensional space and then reconstructing it. It highlights the advantages of autoencoders over traditional methods like PCA, especially in handling non-linear data, and details their structure, applications, and various types such as denoising and variational autoencoders. Additionally, it covers the importance of hyperparameters and loss functions in optimizing autoencoder performance.

Uploaded by

realpokemonfan29

AUTOENCODERS

Understanding Dimensionality Reduction

❑Dimensionality reduction is an
essential technique that helps simplify
high-dimensional data while retaining
the most critical features.
❑The data is transformed from a high-dimensional
space to a low-dimensional space while retaining
meaningful properties of the original data.
❑This helps in dealing with the curse of
dimensionality.
Popular Methods
❑Principal Component Analysis (PCA)
❑t-SNE
❑Linear Discriminant Analysis (LDA)
PCA overview
PCA reduces the original data representation
by linearly projecting it into a
low-dimensional space.
PCA represents the data in terms of principal
components, ordered so that PC1 retains more
of the variance than PC2, and so on.
This can significantly reduce the data size
and thus mitigate the curse of dimensionality.
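As a rough illustration of the PCA overview above, the sketch below computes principal components with a singular value decomposition in NumPy. The toy data, the variable names, and the choice of k = 2 are assumptions made for this example and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 points in 3-D whose variance lies mostly along one direction.
latent = rng.normal(size=(200, 1))
X = latent @ np.array([[3.0, 2.0, 1.0]]) + 0.1 * rng.normal(size=(200, 3))

# PCA: center the data, then take the SVD of the centered matrix.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Variance captured by each principal component (rows of Vt), largest first.
explained_variance = S ** 2 / (len(X) - 1)
explained_ratio = explained_variance / explained_variance.sum()

# Keep the first k components to obtain the low-dimensional representation.
k = 2
X_reduced = X_centered @ Vt[:k].T

print("explained variance ratio:", explained_ratio)
print("reduced shape:", X_reduced.shape)
```

Because the singular values come back sorted in decreasing order, the first entry of `explained_ratio` corresponds to PC1 and is the largest.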
Issue with PCA
Suppose the data has a non-linear structure.
In this case PCA, being a linear method, fails to
transform the data effectively, resulting in a
meaningless transformation.
[Figure: data before PCA and after PCA]


Tackling non-linear data reduction
For non-linear data representations, researchers
introduced several methods, such as:
❑Non-linear PCA / kernel PCA
❑Autoencoders
[Figure: data before and after the transformation]
Here PCA fails to capture the non-linear
relationship, which an autoencoder can capture.
Introduction to Autoencoders
❑Introduced in the 1980s
❑They are a type of neural network architecture
designed to efficiently compress (encode)
input data down to its essential features, then
reconstruct (decode) the original input from
this compressed representation.
❑Their most traditional application is
dimensionality reduction.
❑They belong to unsupervised learning: the
input itself serves as the training target.
Applications
Today, autoencoders are primarily used for
data compression. Typical applications include:
❑Image compression
❑Image reconstruction
❑Generative tasks
Understanding autoencoders
An autoencoder consists of 3 parts:
❑Encoder
❑Bottleneck layer
❑Decoder
Encoder
In a convolutional autoencoder, the encoder is a
set of convolutional blocks followed by pooling
modules that compress the input to the model
into a compact representation.
The encoder compresses the input data (whether it
comes from the training, validation, or test set)
into an encoded representation that is typically
much smaller than the input data.
Bottleneck
The bottleneck (or “code”) contains
the most compressed representation
of the input: it is both the output
layer of the encoder network and the
input layer of the decoder network.
Decoder
The decoder comprises hidden layers
with a progressively larger number of
nodes that decompress (or decode) the
encoded representation of data,
ultimately reconstructing the data back to
its original, pre-encoding form.
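Checking efficiency of Autoencoder
To gauge the efficiency of an autoencoder, we
calculate the reconstruction error (loss).
The loss is calculated by the following equation:
$$L(\theta, \phi) = \frac{1}{n} \sum_{i=1}^{n} \left( x^{(i)} - f_\theta\left(g_\phi\left(x^{(i)}\right)\right) \right)^2$$
where
φ: parameters of the encoder g_φ
θ: parameters of the decoder f_θ
That is, we sum the squared differences between
each original image x^(i) and its reconstruction
f_θ(g_φ(x^(i))), and average over the n images.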
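The reconstruction loss described above can be sketched as a mean squared error between an input batch and its reconstruction. The functions `toy_encode` and `toy_decode` are hypothetical stand-ins for a trained encoder and decoder, and averaging over every element of the batch is an assumption of this sketch.

```python
import numpy as np

def reconstruction_loss(x, encode, decode):
    """Mean squared difference between x and decode(encode(x)), averaged over all elements."""
    x = np.asarray(x, dtype=float)
    x_hat = decode(encode(x))
    return float(np.mean((x - x_hat) ** 2))

# Hypothetical stand-ins for a trained model:
# the "encoder" keeps the first 2 of 4 features, the "decoder" pads the code with zeros.
def toy_encode(x):
    return x[:, :2]

def toy_decode(code):
    return np.hstack([code, np.zeros_like(code)])

batch = np.array([
    [1.0, 2.0, 0.0, 0.0],   # fully described by its first two features
    [1.0, 2.0, 3.0, 4.0],   # loses information in the bottleneck
])

loss = reconstruction_loss(batch, toy_encode, toy_decode)
print("reconstruction loss:", loss)
```

The first row is reconstructed exactly, so the whole loss comes from the second row, whose last two features are lost in the 2-value code.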
Selecting model architecture in autoencoder
The model architecture of an autoencoder is chosen based on the task and the data type. We can
build an autoencoder using any neural network, such as:
❑CNN-based architecture
❑RNN-based architecture
❑Or a simple vanilla feed-forward NN.

However, designing an autoencoder requires choosing certain hyperparameters.

Autoencoder Hyperparameter
❑Code size - The size of the bottleneck determines how much the data is to be
compressed. The code size can also be used as a regularization term: adjustments to code
size are one way to counter overfitting or underfitting.
❑Number of layers - The depth of the autoencoder is measured by the number of layers
in the encoder and decoder. More depth provides greater complexity, while less depth
provides greater processing speed.
❑Number of nodes per layer - Generally, the number of nodes (or “neurons”) decreases
with each encoder layer, reaches a minimum at the bottleneck, and increases with each
layer of the decoder. The number of neurons may also vary with the nature of the input
data: for example, an autoencoder dealing with large images would require more
neurons than one dealing with smaller images.
❑Loss function - When training an autoencoder, the loss function, which measures the
reconstruction loss between the output and the input, is minimized by gradient descent,
with the gradients computed by backpropagation. The ideal loss function depends on the
task the autoencoder will be used for.
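To make the "number of nodes per layer" pattern concrete, here is a minimal sketch that generates layer widths shrinking toward the bottleneck and growing again in the decoder. The helper name `symmetric_layer_sizes` and the geometric shrink schedule are assumptions chosen for illustration; the lecture does not prescribe a particular schedule.

```python
def symmetric_layer_sizes(input_dim, code_size, depth):
    """Return node counts for input, encoder layers, bottleneck, decoder layers, output.

    depth is the number of hidden layers on each side of the bottleneck.
    Widths shrink by a constant factor per layer (a hypothetical design choice).
    """
    ratio = (code_size / input_dim) ** (1.0 / (depth + 1))
    # Input layer followed by `depth` encoder hidden layers.
    encoder = [round(input_dim * ratio ** k) for k in range(depth + 1)]
    # Mirror the encoder widths to build the decoder.
    return encoder + [code_size] + encoder[::-1]

sizes = symmetric_layer_sizes(input_dim=784, code_size=8, depth=2)
print(sizes)
```

Changing `code_size` adjusts the amount of compression, and changing `depth` trades model complexity against processing speed, as described in the list above.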
Working
Suppose we have an input image of
size (28 x 28).
The image needs to be flattened before
feeding it to a fully connected neural network.
❖The flattened image has shape (784,).
❖It is passed to the encoder.
❖The output of the encoder is then
fed to the bottleneck or latent
space, which should be a reduced
version of the input.
❖For instance, if the number of nodes
in the latent space is 8, we have
compressed an image of 784 values
down to just 8 values.
❖The decoder network then tries
to recreate the original (28 x 28)
input image from the compressed
state in the bottleneck.
❖Once the image is reconstructed,
you compare the reconstructed
image with the original image,
compute the difference, and
calculate the loss, which can then be
minimized.
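The steps above can be traced with a small NumPy sketch of a single forward pass: flatten, encode to 8 values, decode back to 784 values, and compute the loss. The weights are random and untrained, and the single-layer encoder and decoder, the activation functions, and the random "image" are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 28 x 28 grayscale image with pixel values in [0, 1).
image = rng.random((28, 28))
x = image.reshape(-1)                      # flattened input, shape (784,)

# Untrained, randomly initialised weights (one layer each, for illustration only).
W_enc = rng.normal(0.0, 0.01, size=(8, 784))
b_enc = np.zeros(8)
W_dec = rng.normal(0.0, 0.01, size=(784, 8))
b_dec = np.zeros(784)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

code = np.tanh(W_enc @ x + b_enc)          # bottleneck / latent code, shape (8,)
x_hat = sigmoid(W_dec @ code + b_dec)      # reconstruction, shape (784,)
reconstruction = x_hat.reshape(28, 28)     # back to image shape

# Reconstruction loss: mean squared difference between input and output.
loss = float(np.mean((x - x_hat) ** 2))
print(x.shape, code.shape, reconstruction.shape, loss)
```

Training would repeat this forward pass over many images and adjust the weights by gradient descent to reduce `loss`; that step is omitted here.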
Using reconstruction loss for anomaly detection
Another application of autoencoders is
detecting outliers (anomalies).
Example
Suppose an autoencoder is trained
on a dataset of 1000 data points
drawn from a two-dimensional circle
distribution.
After training, the reconstruction error for each data point is calculated using MSE (mean
squared error). Let’s assume the threshold for outlier detection is set at the mean plus 1.5 times
the standard deviation of the reconstruction errors.
❑Mean reconstruction error (MSE): 0.007 (hypothetical value)
❑Standard deviation of reconstruction errors: 0.003 (hypothetical value)
To detect outliers
Threshold calculation:
❑Threshold = Mean + 1.5 × Standard deviation
❑Threshold = 0.007 + 1.5 × 0.003
❑Threshold = 0.0115

Identifying outliers
Data points with reconstruction errors greater than 0.0115 are
considered outliers. In other words, a data point is flagged as an
outlier when its reconstruction error exceeds the threshold.
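The threshold rule can be checked with a few lines of Python. The mean and standard deviation are the hypothetical values from the example, while the list of per-point reconstruction errors is invented here purely to show the flagging step.

```python
# Hypothetical summary statistics from the example.
mean_error = 0.007
std_error = 0.003
k = 1.5  # number of standard deviations above the mean

# Threshold = mean + k * standard deviation = 0.007 + 0.0045
threshold = mean_error + k * std_error

# Invented per-point reconstruction errors, for illustration only.
errors = [0.004, 0.008, 0.011, 0.020]

# A point is an outlier when its reconstruction error exceeds the threshold.
flags = [err > threshold for err in errors]

print("threshold:", round(threshold, 4))
print("outlier flags:", flags)
```

Only the last point is flagged, since its error is the only one above the threshold.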
Variants of Autoencoders
❑Denoising Autoencoder
❑Sparse Autoencoder
❑Variational Autoencoder
Denoising Autoencoder
A denoising autoencoder takes a
partially corrupted input and is trained
to recover the original, undistorted
image. This is an effective way to
prevent the network from simply
copying its input, so that it learns
the underlying structure and
important features of the data.
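A minimal sketch of how the training pairs for a denoising autoencoder might be built: the network receives the corrupted input, but the loss is measured against the clean original. Additive Gaussian noise, the noise level of 0.2, and the helper names are assumptions for this example; other corruption schemes (such as masking pixels) are also possible.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 flattened images with pixel values in [0, 1).
clean = rng.random((4, 784))

def corrupt(x, noise_std, rng):
    """Add Gaussian noise and clip back to the valid pixel range [0, 1]."""
    noisy = x + rng.normal(0.0, noise_std, size=x.shape)
    return np.clip(noisy, 0.0, 1.0)

def denoising_loss(reconstruction, clean_target):
    """MSE against the clean image, not against the noisy input."""
    return float(np.mean((reconstruction - clean_target) ** 2))

noisy = corrupt(clean, noise_std=0.2, rng=rng)

# An untrained "identity" model that just copies its noisy input is penalised,
# which is why copying the input is not a good solution for this objective.
copy_loss = denoising_loss(noisy, clean)
print("loss of simply copying the noisy input:", copy_loss)
```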
Sparse Autoencoder
In a standard autoencoder, the network
learns to encode and decode data
without any constraints on the hidden
layer's activations. Sparse autoencoders
modify this behavior by adding a
sparsity constraint, which allows only a
small number of hidden units to be
active at a time. This
encourages the network to discover
more meaningful and interpretable
features.
The sparsity constraint can be enforced through various techniques:

1. L1 Regularization: Adds a penalty proportional to the absolute values of the hidden activations.

2. KL Divergence: Measures the difference between the average activation of the hidden
neurons and a target sparsity level.
The objective function for a sparse
autoencoder can be expressed as
follows:
$$L = \lVert X - \hat{X} \rVert^2 + \lambda \cdot \mathrm{Penalty}(s)$$
where
•X: Input data.
•X^: Reconstructed output.
•λ: Regularization parameter.
•Penalty(s): A function that penalizes
deviations from sparsity, often
implemented using KL-divergence.
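The two sparsity penalties can be sketched in plain Python as follows. The function names, the target sparsity `rho = 0.05`, and the assumption that hidden activations lie in (0, 1) for the KL version (for example, sigmoid units) are choices made for this illustration and are not specified in the lecture.

```python
import math

def l1_penalty(activations):
    """Average (over the batch) of the summed absolute hidden activations."""
    return sum(abs(a) for row in activations for a in row) / len(activations)

def kl_sparsity_penalty(activations, rho=0.05):
    """Sum over hidden units of KL(rho || rho_hat_j).

    rho_hat_j is the average activation of hidden unit j over the batch;
    activations are assumed to lie in (0, 1).
    """
    n = len(activations)
    n_hidden = len(activations[0])
    penalty = 0.0
    for j in range(n_hidden):
        rho_hat = sum(row[j] for row in activations) / n
        rho_hat = min(max(rho_hat, 1e-8), 1.0 - 1e-8)  # avoid log(0)
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat))
    return penalty

def sparse_loss(reconstruction_error, penalty, lam):
    """Total objective: reconstruction error plus lambda times the sparsity penalty."""
    return reconstruction_error + lam * penalty

sparse_batch = [[0.05, 0.05], [0.05, 0.05]]   # average activation equals the target
dense_batch = [[0.9, 0.8], [0.7, 0.95]]       # units active almost all the time

print(kl_sparsity_penalty(sparse_batch), kl_sparsity_penalty(dense_batch))
```

The KL penalty vanishes when each unit's average activation matches the target sparsity and grows as units become active more often.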
Variational Autoencoder (VAE)
❖Used for data generation
❖The basic idea behind the VAE is that instead
of mapping an input to a fixed vector, the
input is mapped to a distribution.
❖The primary difference between an AE and a
VAE is that the bottleneck of the VAE is a
continuous distribution described by two separate
vectors: one representing the means of
the distribution, and the other
representing the standard deviations of
the distribution.
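To illustrate how the mean and standard deviation vectors are used, the sketch below draws a latent vector as z = mu + sigma * eps with eps sampled from a standard normal. This form of sampling is commonly called the reparameterization trick; it is not named in the lecture, and the example values of `mu` and `sigma` are hypothetical.

```python
import random

def sample_latent(mu, sigma, rng):
    """Draw z with z_i = mu_i + sigma_i * eps_i, where eps_i ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

rng = random.Random(0)

# Hypothetical encoder outputs for one input: a 3-dimensional latent distribution.
mu = [0.5, -1.0, 2.0]
sigma = [0.1, 0.2, 0.3]

z = sample_latent(mu, sigma, rng)
print("sampled latent vector:", z)
```

Each call yields a different z for the same input, which is what lets a decoder generate varied outputs from nearby points in the latent space.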
❖The loss function of the VAE is defined by
two terms: the reconstruction loss and a
regularizer, which is essentially a KL
divergence between the encoder’s
distribution and the prior over the latent space.
❖KL divergence:
The Kullback-Leibler divergence (D_KL for
short) is a measure of how one probability
distribution differs from another. For
discrete probability distributions P and
Q, the KL divergence between P and Q
is defined as:
$$D_{KL}(P \parallel Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}$$
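For discrete distributions, the KL divergence is a sum of P(x) log(P(x) / Q(x)) over all outcomes x, which can be computed directly. This sketch uses the natural logarithm and assumes Q(x) > 0 wherever P(x) > 0; the example distributions are made up.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) for discrete distributions given as lists of probabilities.

    Terms with P(x) = 0 contribute nothing; Q(x) is assumed positive where P(x) > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

print("KL(P || Q):", kl_divergence(p, q))
print("KL(Q || P):", kl_divergence(q, p))
print("KL(P || P):", kl_divergence(p, p))
```

The two directions give different values, so the KL divergence is not symmetric and is not a true distance.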
Working
A variational autoencoder uses the KL
divergence as part of its loss function; the
goal is to minimize the
difference between an assumed
distribution and the true distribution
underlying the dataset.

Suppose the observation x is generated
from a latent variable z. To infer z from x,
we want to calculate the posterior p(z∣x).
We can do it in the following way:
$$p(z \mid x) = \frac{p(x \mid z)\, p(z)}{p(x)}$$
But the calculation of p(x) can be
quite difficult, since it requires
integrating over all values of z.
Hence, we approximate p(z|x)
by a tractable distribution q(z|x).
To make q(z|x) a good approximation of
p(z|x), we minimize the
KL divergence, which measures
how different the two distributions are:
$$\min \; D_{KL}\left(q(z \mid x) \parallel p(z \mid x)\right)$$
By simplifying, the above
minimization problem is equivalent to
the following maximization problem:
$$\max \; \mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right] - D_{KL}\left(q(z \mid x) \parallel p(z)\right)$$
The first term represents the
reconstruction likelihood, and the
second term ensures that our learned
distribution q(z|x) stays close to the
prior distribution p(z).
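The equivalence between the minimization and the maximization can be made explicit with a short derivation. This is a sketch that uses Bayes' rule for p(z|x) and writes the expectation under q(z|x) as E_q, a notation that does not appear in the slides.

```latex
\begin{aligned}
D_{KL}\left(q(z \mid x) \parallel p(z \mid x)\right)
  &= \mathbb{E}_{q(z \mid x)}\left[\log q(z \mid x) - \log p(z \mid x)\right] \\
  &= \mathbb{E}_{q(z \mid x)}\left[\log q(z \mid x) - \log p(x \mid z) - \log p(z)\right] + \log p(x) \\
  &= -\mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right]
     + D_{KL}\left(q(z \mid x) \parallel p(z)\right) + \log p(x)
\end{aligned}
```

Since log p(x) does not depend on q, minimizing the left-hand side over q is the same as maximizing the expected log-likelihood of x given z minus the KL divergence between q(z|x) and the prior p(z).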
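Thus our total loss consists of two
terms: the reconstruction error and
the KL-divergence loss:
$$L = -\mathbb{E}_{q(z \mid x)}\left[\log p(x \mid z)\right] + D_{KL}\left(q(z \mid x) \parallel p(z)\right)$$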
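A minimal sketch of the two-term VAE loss for a single example follows. It assumes a diagonal Gaussian encoder distribution and a standard normal prior, for which the KL term has a closed form, and it uses squared error as the reconstruction term. Neither assumption is stated in the slides, and the function names are hypothetical.

```python
import math

def gaussian_kl_to_standard_normal(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form."""
    return 0.5 * sum(s * s + m * m - 1.0 - math.log(s * s) for m, s in zip(mu, sigma))

def vae_loss(x, x_hat, mu, sigma):
    """Reconstruction error (sum of squared differences) plus the KL regularizer."""
    reconstruction = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return reconstruction + gaussian_kl_to_standard_normal(mu, sigma)

# Hypothetical values for one input and its reconstruction.
x = [0.2, 0.8, 0.5]
x_hat = [0.25, 0.7, 0.5]
mu = [0.3, -0.1]
sigma = [0.9, 1.1]

print("KL term:", gaussian_kl_to_standard_normal(mu, sigma))
print("total loss:", vae_loss(x, x_hat, mu, sigma))
```

The KL term is zero only when the encoder outputs exactly the prior (zero means, unit standard deviations), so it pulls the latent distributions toward the prior while the reconstruction term pulls them toward encoding the input.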
VAEs are widely used for synthetic data
generation.
