0% found this document useful (0 votes)
22 views49 pages

Deep Generative Models Overview

The document provides an overview of deep generative models, focusing on unsupervised learning techniques such as PCA, auto-encoders, VAEs, and GANs. It discusses the objectives and applications of unsupervised learning, including density estimation, clustering, and feature learning. Additionally, it compares GANs and VAEs, highlighting their strengths and weaknesses in generating new samples from training data.

Uploaded by

imran khan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
22 views49 pages

Deep Generative Models Overview

The document provides an overview of deep generative models, focusing on unsupervised learning techniques such as PCA, auto-encoders, VAEs, and GANs. It discusses the objectives and applications of unsupervised learning, including density estimation, clustering, and feature learning. Additionally, it compares GANs and VAEs, highlighting their strengths and weaknesses in generating new samples from training data.

Uploaded by

imran khan
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Deep Generative Models

Shenlong Wang
● Why unsupervised learning?

● Old-school unsupervised learning

Overview ○ PCA, Auto-encoder, KDE, GMM

● Deep generative models

○ VAEs, GANs
Unsupervised Learning
● No labels are provided during training
● General objective: inferring a function to describe hidden structure from
unlabeled data
○ Density estimation (continuous probability)
○ Clustering (discrete labels)
○ Feature learning / representation learning (continuous vectors)
○ Dimension reduction (lower-dimensional representation)
○ etc.
Why Unsupervised Learning?
● Density estimation: estimate the probability density function p(x) of a random
variable x, given a bunch of observations {X1, X2, ...}

2D density estimation of
Stephen Curry’s
shooting position

Credit: BallR
Why Unsupervised Learning?
● Clustering: grouping a set of input {X1, X2, ...} in such a way that objects in
the same group (called a cluster) are more similar

Clustering analysis of Hall-of-fame players in NBA

Credit: BallR
Why Unsupervised Learning?
● Feature learning: a transformation of raw data input to a representation that
can be effectively exploited in machine learning tasks

2D topological visualization given the input how similar players


are with regard to points, rebounds, assists, steals, rebounds,
blocks, turnovers and fouls

Credit: Ayasdi
Why Unsupervised Learning?
● Dimension reduction: reducing the number of random variables under
consideration, via obtaining a set of principal variables

Principle component analysis over players trajectory data

Credit: Bruce, Arxiv 2016


Principle Component Analysis (PCA)
An algorithm that conducts dimension reduction

Intuition:

● Finds the lower-dimension projection that minimizes


reconstruction error
● Keep the most information (maximize variance)

See more details in Raquel’s CSC411 slides:


[Link]
Principle Component Analysis (PCA)
An algorithm that conducts dimension reduction

Intuition:

● Finds the lower-dimension projection that minimizes


reconstruction error
● Keep the most information (maximize variance)

Algorithm:

● Conduct eigen decomposition


● Find K-largest eigenvectors
● Linear projection with the matrix composed of K
eigenvectors
See more details in Raquel’s CSC411 slides:
[Link]
Auto-encoder
A neural network that the output is the input itself.

Intuition:

● A good representation should keep the information well (reconstruction error)


● Deep + nonlinearity might help enhance the representation power
Auto-encoder
A neural network that the output is the input itself.

Intuition:

● A good representation should keep the information well (reconstruction error)


● Deep + nonlinearity might help enhance the representation power

Credit: LeCun
Learnt representation
Auto-encoder
A neural network that the output is the input itself.

10-dimensional Auto-encoder feature embedding based on players shooting tendency

Credit: Wang et al. 2016 Sloan Sports Conference


Kernel Density Estimation (KDE)
A nonparametric way to estimate the probability density function of a random variable

Intuition:

● Point with more neighbouring samples have higher density


● Smoothed histogram, centered at data point

Kernel function, measures the similarity


Credit: Wikipedia
Kernel Density Estimation (KDE)
A nonparametric way to estimate the probability density function of a random variable

Applications:

● Visualization
● Sampling

Shooting heat map of Lamarcus Aldridge


2015-2016. Credit: Squared Statistics
Generative models
Task: generate new samples follows the same probabilistic distribution of a given a training dataset
Generative models
Task: generate new samples follows the same probabilistic distribution of a given a training dataset
Generative models
Task: generate new samples follows the same probabilistic distribution of a given a training dataset

Training samples Generated samples


Credit: Kingma

Note: sometimes it’s fine if we cannot estimate the explicit form of p(x), since it might be over complicated
Variational Auto-encoder (VAE)
Intuition: given a bunch of random variables that can be sampled easily, we can generate random
samples following other distributions, through a complicated non-linear mapping x = f(z)

Image Credit: Doersch 2016


Variational Auto-encoder (VAE)
Intuition: given a bunch of random variables that can be sampled easily, we can generate some new
random samples through a complicated non-linear mapping x = f(z)

Image Credit: Doersch 2016


Variational Auto-encoder (VAE)
Intuition: given a bunch of random variables, we can generate some new random samples through a
complicated non-linear mapping x = f(z)

Gaussian
NN

Image Credit: Doersch 2016


Variational Auto-encoder (VAE)
You can consider it as a decoder!

Decoder
network

Code z Gaussian
parameters
Variational Auto-encoder (VAE)
How do we learn the parameters?

Decoder
network

Code z Gaussian
parameters
Variational Auto-encoder (VAE)
Graphical model

Image Credit: Doersch 2016


Variational Auto-encoder (VAE)
Learning objective: maximize the log-probability

many sampled z will have a close-to-zero p(x|z)

Quiz: Why not doing this?


Image Credit: Doersch 2016
Variational Auto-encoder (VAE)
Learning objective: maximize variational lower-bound

Variational lower-bound

Quiz: How to choose a good proposal distribution?


Proposal distribution
Variational Auto-encoder (VAE)
Learning objective: maximize variational lower-bound

Variational lower-bound

Proposal distribution
Quiz: How to choose a good proposal distribution?

● Easy to sample
● Differentiable
● Given a training sample X, the sampled z is likely to have a non-zero p(x|z)
Variational Auto-encoder (VAE)
Learning objective: maximize variational lower-bound

Answer: Another neural network + Gaussian to approximate the posterior!


Variational Auto-encoder (VAE)
Learning objective: maximize variational lower-bound

Reconstruction error: Prior:

● Training samples have higher probability ● Proposal distribution should be like Gaussian
Variational Auto-encoder (VAE)
Learning objective: maximize variational lower-bound

● KL-Divergence: closed-form and differentiable if


both are Gaussians
● Reconstruction error: approximate by just
sampling one z

Computation graph
Credit: Doersch
Variational Auto-encoder (VAE)
Why it is the variational lower-bound?

Jenson inequality

Kingma et al. 2014


Variational Auto-encoder (VAE)
The whole learning structure
KL-Divergence

Reconstruction Loss

Encoder Decoder
network network

Input Image x Code z Reconstruction


Variational Auto-encoder (VAE)
Results

Kingma et al. 2014


Generative Adversarial Network (GAN)

Generator

Code z Generated
Image
Generative Adversarial Network (GAN)

Generator

Fake

Code z Generated Image Discriminator


Real

Training Image
Generative Adversarial Network (GAN)
Intuitions

Crook

Google
Generative Adversarial Network (GAN)
Intuitions

Generator

Teller

Google
Generative Adversarial Network (GAN)
Intuitions

Crook

Teller

Google
Generative Adversarial Network (GAN)
Intuitions:

● Generator tries the best to cheat


the discriminator by generating Generator
more realistic images
Fake
Code z Generated Image Discriminator Real

● Discriminator tries the best to


distinguish whether the image is
generated by computers or not
Training Image
Generative Adversarial Network (GAN)
Objective function:

For each iteration:

● Sample a mini-batch of fake images and true images


● Update G using back-prop
● Update D using back-prop

Very difficult to optimize:

● Min-max problem: finding a saddle point instead of a local optimum, unstable


Generative Adversarial Network (GAN)

Generator

Code z Generated Image Discriminator Cross-Entropy

Training Image
GANs for face and bedroom

Credit: Denton
GANs for Japanese Anime

Credit: Radford
GAN for videos

Credit: Vondrick
Generative Adversarial Network (GAN)
Extensions:

● DCGANs: some hacks that work well


● LAPGANs: coarse-to-fine conditional generation through Laplacian pyramids
● f-GANs: more general GANs with different loss other than cross-entropy
● infoGANs: additional objective that maximize mutual-information between the latent and the sample
● EBGANs: Discriminative as energy functions
● GVMs: using GANs as an energy term for interactive image manipulation
● Conditional GANs: not random z, instead z is some data from other domain
● ...
Generative Adversarial Network (GAN)
Hacks:

● How to train a GAN?


● 17 hacks that make the training work.
● [Link]
GANs vs VAEs
GANs:

● High-quality visually appealing result


● Difficult to train
● The idea of adversarial training can be applied in many other domains

VAEs:

● Easy to train
● Blurry result due to minimizing the MSE based reconstruction error
● Nice probabilistic formulation, easy to introduce prior
Demos
VAEs:

● [Link]

GANs:

● [Link]
● [Link]
References
[1] Goodfellow and Bengio, “Deep learning ” 2016
[2] Goodfellow, Ian, et al. "Generative adversarial nets." Advances in neural information processing systems. 2014.
[3] Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).
[4] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial
networks." arXiv preprint arXiv:1511.06434 (2015).
[5] Nowozin, Sebastian, Botond Cseke, and Ryota Tomioka. "f-GAN: Training generative neural samplers using variational divergence
minimization." Advances in Neural Information Processing Systems. 2016.
[6] Denton, Emily L., Soumith Chintala, and Rob Fergus. "Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks."
Advances in neural information processing systems. 2015.
[7] Chen, Xi, et al. "Infogan: Interpretable representation learning by information maximizing generative adversarial nets." Advances in Neural
Information Processing Systems. 2016.
[8] Zhao, Junbo, Michael Mathieu, and Yann LeCun. "Energy-based generative adversarial network." arXiv preprint arXiv:1609.03126 (2016).
[9] Doersch, Carl. "Tutorial on variational autoencoders." arXiv preprint arXiv:1606.05908 (2016).
[10] Wang, K-C., and Richard Zemel. "classifying NBA offensive plays using neural networks." MIT Sloan Sports Analytics Conference, 2016.
[11] Wikipedia “Kernel density estimation”
[12] Wikipedia “Principal component analysis”
[13] Wikipedia “Autoencoder”
Thanks

You might also like