
Deep Learning Unit 2

The document provides an overview of deep learning, covering its history, probabilistic theory, backpropagation, regularization, batch normalization, VC dimensions, and comparisons between deep and shallow networks. It also discusses specific architectures like CNNs and GANs, along with semi-supervised learning techniques. Key advantages and disadvantages of various methods and applications in fields such as image recognition and NLP are highlighted.

Uploaded by

siapa11102004

Deep Learning: Unit - 2 (One Shot)

1. History of Deep Learning

 Definition: Deep learning is a subset of machine learning (ML) that focuses on algorithms inspired by the
structure and function of the brain, known as artificial neural networks (ANNs).

 Timeline:

o 1943: McCulloch and Pitts proposed the first mathematical model of an artificial neuron.

o 1950s - 60s: Initial neural network models (e.g., perceptrons) were introduced by Frank
Rosenblatt.

o 1980s: The backpropagation algorithm was reintroduced, reviving interest in neural
networks.

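o 1990s: Neural network architectures matured (e.g., convolutional networks such as LeNet and
recurrent LSTM networks), laying the groundwork for deep neural networks.

o 2006: Deep Belief Networks (DBNs) were introduced, marking the start of the modern deep learning era.

o 2012: AlexNet won the ImageNet challenge, showing the potential of deep learning in image recognition tasks.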

o 2014: GANs (Generative Adversarial Networks) were introduced.

o 2015 - present: Widespread adoption of deep learning across various industries (e.g.,
NLP, automation, autonomous vehicles, etc.).

2. Probabilistic Theory of Deep Learning

The probabilistic theory of deep learning applies probabilistic methods to model uncertainty and deal
with incomplete or noisy data.

Key Aspects:

1. Bayesian Neural Networks: These place probability distributions over the weights instead of
single point values, so predictions carry an uncertainty estimate and the models tend to be more robust.

2. Gaussian Processes: A probabilistic method used to define distributions over functions.

3. Likelihood Estimation: Models are trained to maximize the likelihood of observing the training
data.

4. Markov Chain Monte Carlo (MCMC): Used to sample from distributions when direct
calculation is difficult.

5. Maximum Likelihood Estimation (MLE): A method for estimating parameters of a probabilistic
model.

 Applications: Image recognition, Speech recognition, Text generation, Autonomous driving.

 Advantages: Handles uncertainty well; can make predictions even with missing data.
 Disadvantages: Computationally expensive; requires a large amount of data for effective
training.
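
As a small illustration of maximum likelihood estimation from the list above, the sketch below fits a one-dimensional Gaussian to a handful of values. The Gaussian model, the function names, and the sample values are assumptions chosen for illustration; they are not part of the notes.

```python
import math

def gaussian_mle(samples):
    """Closed-form maximum likelihood estimates for a 1-D Gaussian."""
    n = len(samples)
    mean = sum(samples) / n
    # The MLE of the variance divides by n (not n - 1).
    variance = sum((x - mean) ** 2 for x in samples) / n
    return mean, variance

def log_likelihood(samples, mean, variance):
    """Log-likelihood of the samples under N(mean, variance)."""
    return sum(
        -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)
        for x in samples
    )

data = [2.0, 4.0, 4.0, 6.0]  # illustrative values only
mu_hat, var_hat = gaussian_mle(data)

# Any other parameter choice should give a lower log-likelihood.
best = log_likelihood(data, mu_hat, var_hat)
shifted_mean = log_likelihood(data, mu_hat - 1.0, var_hat)
wider_variance = log_likelihood(data, mu_hat, var_hat + 1.0)
```

Training a neural network by minimizing cross-entropy or squared error follows the same idea: the loss is the negative log-likelihood of the training data under the model.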

3. Backpropagation and Regularization

Backpropagation

It is a supervised learning algorithm that uses the chain rule to compute the gradient of the error with
respect to every weight, so the weights can be adjusted to minimize the error in a neural network.

 Steps:

1. Input the data.

2. Feedforward pass.

3. Error computation.

4. Backpropagation of error.

5. Update the weights and biases.
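
The five steps above can be sketched on a deliberately tiny network: one input, one tanh hidden unit, and one linear output, trained with a squared-error loss. The architecture, learning rate, and data points are assumptions made for illustration, not something the notes specify.

```python
import math

def forward(params, x):
    """Step 2, feedforward pass: x -> tanh hidden unit -> linear output."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y

def loss_fn(y, target):
    """Step 3, error computation (squared error)."""
    return 0.5 * (y - target) ** 2

def backward(params, x, target):
    """Step 4, backpropagate the error with the chain rule."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    dy = y - target            # dLoss/dy
    dw2 = dy * h               # output-layer weight gradient
    db2 = dy                   # output-layer bias gradient
    dh = dy * w2               # error sent back to the hidden unit
    dz = dh * (1.0 - h * h)    # tanh'(z) = 1 - tanh(z)^2
    dw1 = dz * x
    db1 = dz
    return [dw1, db1, dw2, db2]

def train(params, data, lr=0.1, epochs=200):
    """Step 5, update weights and biases by gradient descent."""
    for _ in range(epochs):
        for x, target in data:   # step 1, input the data
            grads = backward(params, x, target)
            params = [p - lr * g for p, g in zip(params, grads)]
    return params

def total_loss(params, data):
    return sum(loss_fn(forward(params, x)[1], t) for x, t in data)

data = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.8)]   # illustrative (input, target) pairs
initial = [0.5, 0.0, 0.5, 0.0]                # w1, b1, w2, b2
trained = train(initial, data)
loss_before = total_loss(initial, data)
loss_after = total_loss(trained, data)
```

Comparing `loss_before` with `loss_after` shows whether the updates reduced the error on the training points.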

Regularization

These techniques are used to prevent overfitting by adding a penalty term to the loss function.

 Types:

1. L1 Regularization (Lasso): Adds a penalty proportional to the sum of the absolute values of the
weights.

2. L2 Regularization (Ridge): Adds a penalty proportional to the sum of the squared weights.

 Advantages: Prevents overfitting; improves generalization of deep neural networks; helps in
reducing variance in large models.

 Disadvantages: Increases computational complexity; may underfit if not tuned properly.
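
A minimal sketch of how the two penalty terms are added to a loss. The function names, the strength parameter `lam`, and the weight values are illustrative assumptions.

```python
def l1_penalty(weights, lam):
    """L1 (Lasso): lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 (Ridge): lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Total loss = data loss + penalty term."""
    if kind == "l1":
        return data_loss + l1_penalty(weights, lam)
    return data_loss + l2_penalty(weights, lam)

weights = [0.5, -2.0, 0.0, 1.5]   # illustrative values only
l1_total = regularized_loss(1.0, weights, lam=0.1, kind="l1")  # about 1.0 + 0.1 * 4.0
l2_total = regularized_loss(1.0, weights, lam=0.1, kind="l2")  # about 1.0 + 0.1 * 6.5
```

Here `lam` controls the trade-off: a larger value pushes the weights toward zero more strongly, which is where the risk of underfitting comes from.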

4. Batch Normalization & VC Dimensions

Batch Normalization

It normalizes the activations of each layer in a neural network to improve training speed and stability.

 Working:

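1. Normalize each layer's activations by subtracting the mini-batch mean and dividing by the
mini-batch standard deviation.

2. Scale and shift the normalized output by learnable parameters.

3. Apply this during both training and testing phases; at test time, running averages of the mean
and variance collected during training replace the batch statistics.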

 Advantages: Reduces internal covariate shift; speeds up training; helps mitigate overfitting.
 Disadvantages: May not work well with small batch sizes, where the batch statistics are noisy; adds extra computational overhead.
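
The working steps above can be sketched for a single feature as follows. This is a simplified illustration: the names `gamma` and `beta` for the learnable scale and shift, the small constant `eps`, and the sample batch are assumptions, and real layers apply the same computation to every feature at once.

```python
import math

def batch_norm_train(batch, gamma, beta, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    # eps avoids division by zero when the batch variance is tiny.
    normalized = [(x - mean) / math.sqrt(var + eps) for x in batch]
    out = [gamma * x_hat + beta for x_hat in normalized]
    return out, mean, var

def batch_norm_inference(x, running_mean, running_var, gamma, beta, eps=1e-5):
    """At test time, stored statistics stand in for the batch statistics."""
    return gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta

batch = [1.0, 2.0, 3.0, 4.0]   # illustrative activations
out, mean, var = batch_norm_train(batch, gamma=1.0, beta=0.0)
out_mean = sum(out) / len(out)
out_var = sum((v - out_mean) ** 2 for v in out) / len(out)
```

With `gamma=1.0` and `beta=0.0` the output has mean close to 0 and variance close to 1; other values let the network undo the normalization if that helps.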

VC Dimensions (Vapnik-Chervonenkis Dimension)

It is a measure of the capacity of a statistical model, defined as the maximum number of points that
can be shattered (classified correctly under every possible labeling) by the model.

 Steps:

1. Shattering: A dataset is said to be shattered if the model can perfectly classify all
possible combinations of labels for the data points.

2. VC Dimension: The largest number of points for which at least one arrangement can be
shattered defines the VC dimension.

 Applications: Understanding model complexity, used in regularization techniques, helps in
model selection, determines generalization ability.
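
Shattering can be checked by brute force for a very simple model class. The sketch below uses one-dimensional threshold classifiers (predict 1 on one side of a threshold, 0 on the other, in either orientation); this model class and the sample points are assumptions chosen because the check stays small.

```python
from itertools import product

def threshold_labelings(points):
    """All labelings of `points` achievable by 1-D threshold classifiers."""
    xs = sorted(points)
    # One candidate threshold per gap, plus one below and one above all points.
    candidates = (
        [xs[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )
    achievable = set()
    for t in candidates:
        right = tuple(1 if x > t else 0 for x in points)   # 1 to the right of t
        left = tuple(1 - label for label in right)          # flipped orientation
        achievable.add(right)
        achievable.add(left)
    return achievable

def is_shattered(points):
    """True if every one of the 2^n labelings is achievable."""
    achievable = threshold_labelings(points)
    return all(lab in achievable for lab in product([0, 1], repeat=len(points)))

two_points = is_shattered([1.0, 2.0])
three_points = is_shattered([1.0, 2.0, 3.0])
```

Two points can be shattered, but with three points the alternating labelings such as (1, 0, 1) are out of reach for any ordering of distinct points, so the VC dimension of this model class is 2.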

5. Deep Neural Networks vs. Shallow Networks

Feature             Deep Neural Networks                          Shallow Networks
------------------  --------------------------------------------  ----------------------------------------------
Layers              Multiple hidden layers.                       One or two hidden layers only.
Capacity            Higher capacity for complex tasks.            Lower capacity.
Training Time       Longer training time.                         Shorter training time.
Feature Extraction  Hierarchical feature extraction.              Limited feature extraction.
Cost                High computation cost.                        Low computation cost.
Regularization      Overfitting risk is lower with                Overfitting risk is higher without
                    regularization.                               regularization.
Non-linearity       Can model complex non-linearities.            Limited ability to model non-linearity.
Problem Type        Better for complex problems.                  Better for simple problems.
Interpretability    Hard to interpret.                            Easy to interpret.
Convergence         Slower convergence.                           Faster convergence.


6. CNN, GAN, and Semi-supervised Learning

CNN (Convolutional Neural Network)

Specialized deep learning models designed for processing structured grid data, like images.

 Layers:

1. Convolution Layer: Apply filters to the input image to detect features (feature
mapping).

2. Pooling Layer: Perform downsampling to reduce dimensionality.

3. Flatten Layer: Convert the pooled feature maps into a one-dimensional vector for the dense
layers.

4. Fully Connected Layer: Connect every neuron to every neuron in the next layer (also
called dense layer); the final dense layer outputs the probabilities for classification.

 Applications: Image recognition, Object detection, Video classification, Medical image
analysis.
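
The convolution, pooling, and flatten operations described above can be sketched in plain Python. The 5x5 image, the 2x2 edge-detecting filter, and the function names are illustrative assumptions; a real CNN learns its filter values and uses an optimized library.

```python
def conv2d(image, kernel):
    """Slide the filter over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete edge windows are dropped."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

def flatten(feature_map):
    """Turn a 2-D feature map into a 1-D vector for a dense layer."""
    return [value for row in feature_map for value in row]

# Dark left half, bright right half: a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]          # responds to dark-to-bright vertical edges

features = conv2d(image, kernel)     # 4x4 feature map
pooled = max_pool(features)          # 2x2 after downsampling
vector = flatten(pooled)             # → [2, 0, 2, 0]
```

The filter responds only where the edge is, and pooling keeps that response while shrinking the map from 4x4 to 2x2.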

GAN (Generative Adversarial Networks)

A class of ML models where two networks (Generator and Discriminator) are trained together,
competing against each other.

 Steps:

1. Generator: Generates fake data based on random input (noise).

2. Discriminator: Tries to differentiate between real and generated data.

3. Both networks improve over time as they try to outsmart each other.

 Applications: Image generation, Style transfer, Video generation, Data augmentation.
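
The competition between the two networks shows up in their loss functions. The sketch below computes the usual binary cross-entropy discriminator loss and the commonly used non-saturating generator loss from discriminator outputs; the probability values are hypothetical, and no actual networks are trained here.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Real samples should score near 1, generated samples near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form: low when the discriminator scores fakes as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that a sample is real.
real_scores = [0.9, 0.8]
weak_fakes = [0.1, 0.2]      # generator is easily caught
strong_fakes = [0.7, 0.6]    # generator is fooling the discriminator

d_loss_weak = discriminator_loss(real_scores, weak_fakes)
d_loss_strong = discriminator_loss(real_scores, strong_fakes)
g_loss_weak = generator_loss(weak_fakes)
g_loss_strong = generator_loss(strong_fakes)
```

As the fakes improve, the generator loss falls and the discriminator loss rises, which is the tug-of-war described in step 3.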

Semi-supervised Learning

Uses both labeled and unlabeled data to train models, providing a middle ground between supervised
and unsupervised learning.

 Example: Improving image recognition using a few labeled images and many unlabeled ones.

 Advantages: Reduces the need for labeled data; can improve performance when labeled data
is scarce.

 Disadvantages: Requires careful selection of unlabeled data; difficult to ensure accuracy
without proper labeling.

 Applications: Image classification with limited data, Speech recognition, NLP, Medical
diagnosis.
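
One common semi-supervised technique is self-training (pseudo-labeling): fit a model on the labeled data, label the unlabeled points it is confident about, and refit. The notes do not name a specific method, so the nearest-centroid classifier, the `min_margin` confidence threshold, and the data below are all assumptions made for illustration.

```python
def centroids(labeled):
    """Mean of the points in each class; `labeled` is a list of (x, label)."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(cents, x):
    """Return (label, margin); margin is how much closer the best centroid is.

    Assumes at least two classes.
    """
    ranked = sorted(cents, key=lambda y: abs(x - cents[y]))
    margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
    return ranked[0], margin

def self_train(labeled, unlabeled, min_margin=1.0, rounds=5):
    """Repeatedly pseudo-label confident points and refit the centroids."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)
        scored = [(x, predict(cents, x)) for x in pool]
        accepted = [(x, label) for x, (label, margin) in scored
                    if margin >= min_margin]
        if not accepted:
            break
        labeled.extend(accepted)
        accepted_x = {x for x, _ in accepted}
        pool = [x for x in pool if x not in accepted_x]
    return centroids(labeled)

labeled = [(0.0, "a"), (10.0, "b")]               # only two labeled points
unlabeled = [1.0, 2.0, 1.5, 8.0, 9.0, 8.5, 5.2]   # 5.2 is ambiguous
cents = self_train(labeled, unlabeled)
```

The ambiguous point 5.2 never clears the margin threshold and stays unlabeled, which reflects the disadvantage noted above: unlabeled data has to be used selectively, or wrong pseudo-labels will distort the model.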
