Computer Vision and Deep Learning Overview

Uploaded by

kmedo8080966

I Introduction

Computer Vision (CV): The goal is to make machines learn to see in a way similar to how humans see, which is
why we need machine learning methods. CV is central to robotics, where a robot has to understand its
environment before it can act in it. CV also covers a large amount of image and video processing.

I.1 The history of Computer Vision

Hubel and Wiesel Experiment

Hubel and Wiesel (neurobiologists) inserted electrodes into cats' brains and recorded the neural activity
while the cat was shown stimuli (mostly edges) on a screen. They found that cells in the visual cortex are
sensitive to the orientation of edges, yet insensitive to the position of the edges. We will see the same
property again later in convolutional networks.

I.2 The summer vision project 1966

The project tried to construct a significant part of a visual system, and it was around this time that the
term "pattern recognition" was coined.
CV is a core element of other areas as well, such as robotics, NLP, optics and image processing, algorithm
optimization, neuroscience, AI and ML.

I.3 Image classification

Previously, DL was not used for image classification; it only became popular later. The earlier pipeline started with preprocessing (e.g. normalizing the colors of the images), followed by a feature descriptor that functioned much like the cells in the Hubel and Wiesel experiment, in the sense that certain properties, such as the position of edges, were not important.
Different types of feature descriptors are HAAR, HOG, SIFT and SURF. These feature descriptors had to be hand-engineered, and most of them are gradient based. After that came aggregators such as SVM, RF, ANN etc., which aggregate the features and output the label.
Instead of feature extraction + aggregation, we now have a "magic box" that does both for us. That magic box is deep learning. We do not have to hand-engineer the feature descriptors; we let the data set decide which descriptor gives the best results.
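To make the idea of a gradient-based, position-insensitive descriptor concrete, here is a minimal sketch in Python. It is not HOG, SIFT or any of the descriptors named above; it only builds a single histogram of gradient orientations over a toy image. The function names `orientation_histogram` and `vertical_edge` and the toy images are assumptions made for illustration.

```python
import numpy as np

def orientation_histogram(image, n_bins=8):
    """Histogram of gradient orientations, weighted by gradient magnitude."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, pi): dark-to-bright and bright-to-dark count the same.
    orientation = np.mod(np.arctan2(gy, gx), np.pi)
    hist, _ = np.histogram(orientation, bins=n_bins, range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

def vertical_edge(size, column):
    """Toy image (hypothetical): dark on the left, bright from `column` onwards."""
    img = np.zeros((size, size))
    img[:, column:] = 1.0
    return img

h_left = orientation_histogram(vertical_edge(16, 4))          # edge near the left
h_right = orientation_histogram(vertical_edge(16, 11))        # same edge, shifted right
h_horizontal = orientation_histogram(vertical_edge(16, 4).T)  # edge rotated by 90 degrees

print(np.allclose(h_left, h_right))       # same descriptor despite the shift
print(np.allclose(h_left, h_horizontal))  # different descriptor for a different orientation
```

Because the histogram discards where in the image a gradient occurred, shifting the edge leaves the descriptor unchanged, while rotating the edge moves the mass to a different bin.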
Image Classification Issues:

• Occlusion.
• Background clutter: background and foreground (object) have similar colors
• Representation: e.g. a cat drawing vs. a cat photo

I.4 History of Deep Learning

It started in the 1940s with the electronic brain. Each cell responds to a certain input pattern: it accumulates weighted impulses and eventually makes a decision.
Around 1960 we saw the perceptron. Instead of using fixed weights, the weights could be learned: we show the system a number of examples and hope that it learns suitable parameters for these perceptrons. Both the feature extraction (weights) and the decision threshold are learned. This was all hardwired.
Then we had Adaline (the golden age of deep learning). There was a lot of hype and progress being made.
Then in 1969, people realized the problems with perceptrons, specifically the XOR problem: a linear model (a single perceptron) cannot separate the two classes. The period that followed is called the AI winter.
In 1986, the multi-layer perceptron came to light. It has several layers whose weights can all be trained (optimized). The training method is called backpropagation, a gradient-based method.
In 1995 there was the SVM. Since it was successful, it put a halt to deep learning.
In 2006, Hinton and Ruslan Salakhutdinov developed Deep Belief Networks, and the idea of pretraining came around: you train a neural network and then train it again for a specific task. Pretraining is still one of the most relevant ideas today (for example, transfer learning with ImageNet weights). Despite this, neural networks were still not a mainstream method.
In 2012, the AlexNet architecture (see Section X.2) was the first neural-network-based architecture to win the ImageNet competition, based on the lowest top-5 error.
Definition of top-5 error: Given an image, ask the method which class it is and check whether its five highest-ranked predictions include the correct class. The top-5 error is the fraction of images for which they do not.
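The top-5 error can be computed directly from a matrix of class scores. The following is a minimal NumPy sketch; the function name `top_k_error` and the toy scores and labels are assumptions made for illustration, not part of any benchmark code.

```python
import numpy as np

def top_k_error(scores, labels, k=5):
    """Fraction of samples whose true label is NOT among the k highest scores."""
    # Indices of the k largest scores per row (their order within the top k is irrelevant).
    top_k = np.argpartition(scores, -k, axis=1)[:, -k:]
    hit = (top_k == labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()

ascending = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7])
descending = ascending[::-1]

# Hypothetical scores for 4 images and 7 classes.
scores = np.stack([ascending, descending, descending, descending])
labels = np.array([4, 2, 6, 0])

print(top_k_error(scores, labels, k=5))  # 0.25: only the third image misses the top 5
print(top_k_error(scores, labels, k=1))  # 0.75: only the fourth image is a top-1 hit
```

The comparison with `k=1` shows why top-5 error is the more forgiving metric: a prediction counts as correct as long as the true class appears anywhere among the five best guesses.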

I.5 What made this [Deep Learning] possible?

• Big Data: With big data, models have far more examples to learn from, and we have much more data today than we did back then. The datasets are also available online.
• Better Hardware: Not only has the data changed, the hardware has changed as well (e.g. GPUs). This hardware was developed for rendering images in games, and it is now also used in deep learning to train models faster.
• Models are more complex

I.6 Different Tasks in DL

• Object Detection
• Self-Driving Cars
• Gaming (e.g. AlphaGo, AlphaStar)
• Machine Translation
• Automated Text Generation (chatbots)
• Healthcare, Cancer Detection

II Machine Learning Basics

Unsupervised Learning
Supervised Learning: An underlying assumption is that the training and test data come from the same distribution.
Nearest neighbor model: A supervised learning method that labels a sample with the majority label of its k nearest neighbors. The hyperparameters to tune in KNN are k and the distance metric (L1 or L2).
Cross Validation: Split the data into K folds; each fold is used once for validation while the remaining folds are used for training.
Decision boundaries are the boundaries along which the data is separated into classes.
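The nearest neighbor model and cross validation fit together naturally: cross validation is a common way to compare values of the hyperparameter k. Below is a minimal NumPy sketch under stated assumptions. The helper names `knn_predict` and `cross_val_accuracy`, the fold count, and the toy data (two Gaussian blobs) are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, k=3, metric="l2"):
    """Label each query point with the majority label of its k nearest training points."""
    diff = X_query[:, None, :] - X_train[None, :, :]
    if metric == "l1":
        dist = np.abs(diff).sum(axis=2)            # L1 (Manhattan) distance
    else:
        dist = np.sqrt((diff ** 2).sum(axis=2))    # L2 (Euclidean) distance
    nearest = np.argsort(dist, axis=1)[:, :k]      # indices of the k closest points
    votes = y_train[nearest]                       # shape (n_query, k)
    return np.array([np.bincount(row).argmax() for row in votes])

def cross_val_accuracy(X, y, k, n_folds=5, metric="l2", seed=0):
    """Mean validation accuracy over n_folds folds; each fold is held out once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    accuracies = []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = knn_predict(X[train_idx], y[train_idx], X[val_idx], k, metric)
        accuracies.append((pred == y[val_idx]).mean())
    return float(np.mean(accuracies))

# Toy data (assumption): two well separated Gaussian blobs with integer labels 0 and 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)), rng.normal(4.0, 0.5, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

scores = {k: cross_val_accuracy(X, y, k) for k in (1, 3, 5)}
print(scores)
```

On real data one would pick the k (and metric) with the best cross-validated accuracy and only then evaluate once on the held-out test set.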
The pros and cons of using linear decision boundaries:

+ It’s very easy to implement and derive
+ It’s easy to find the hyperparameters
- The classes must be clearly (linearly) separable
- Harder to use for multiple classes (?)

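Linear Regression is a supervised learning method that finds a linear model that explains a target $y$ given inputs $x$ with weights $\theta$:

$$\hat{y}_i = \sum_{j=1}^{d} x_{ij}\,\theta_j$$

and the prediction looks like:

$$\hat{y}_i = \theta_0 + \sum_{j=1}^{d} x_{ij}\,\theta_j \implies \hat{y} = X\theta$$

$x_{ij}$ are the features; $\theta$ are the weights (model parameters); $\theta_0$ is a bias. In the matrix form $\hat{y} = X\theta$, the bias is absorbed by adding a constant feature of 1 to every row of $X$.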
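Mean squared error:

$$J(\theta) = \frac{1}{n}\sum_{i=1}^{n}(\hat{y}_i - y_i)^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i\theta - y_i)^2$$

Matrix notation (the constant factor $\frac{1}{n}$ does not change the minimizer):

$$\min_{\theta} J(\theta) = \min_{\theta}\,(X\theta - y)^T (X\theta - y)$$

This loss function is convex and quadratic in $\theta$, so setting its gradient $2X^T(X\theta - y)$ to zero gives a closed-form solution (assuming $X^T X$ is invertible):

$$\theta = (X^T X)^{-1} X^T y = X^{+} y$$

where $X^{+}$ denotes the pseudo-inverse of $X$.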
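The closed-form solution is easy to check numerically. The following minimal NumPy sketch fits synthetic data two ways, via the normal equations and via the pseudo-inverse; the data, the "true" parameter vector and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3

# Synthetic data (assumption): y is linear in the features plus small Gaussian noise.
X_raw = rng.normal(size=(n, d))
true_theta = np.array([1.5, -2.0, 0.5, 3.0])   # hypothetical [bias, w1, w2, w3]

# Prepend a column of ones so that the bias becomes part of theta.
X = np.hstack([np.ones((n, 1)), X_raw])
y = X @ true_theta + 0.1 * rng.normal(size=n)

# Normal equations: solve (X^T X) theta = X^T y instead of forming an explicit inverse.
theta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# Same estimate through the pseudo-inverse of X.
theta_pinv = np.linalg.pinv(X) @ y

mse = np.mean((X @ theta_normal - y) ** 2)
print(theta_normal, mse)
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a design choice for numerical stability; the pseudo-inverse route also works when $X^T X$ is not invertible.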

II.1 Maximum Likelihood

Find the parameter values that maximize the likelihood of the observations given the parameters.

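$$\begin{aligned}
\theta_{ML} &= \arg\max_{\theta}\; p_{model}(Y \mid X, \theta) \\
&= \arg\max_{\theta}\; \prod_{i=1}^{n} p_{model}(y_i \mid x_i, \theta) \\
&= \arg\max_{\theta}\; \sum_{i=1}^{n} \log p_{model}(y_i \mid x_i, \theta)
\end{aligned}$$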

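MLE assumes that the training samples are independent and generated by the same distribution.
What shape does our probability distribution have? Assuming a Gaussian distribution: $y_i \sim \mathcal{N}(x_i\theta, \sigma^2) = x_i\theta + \mathcal{N}(0, \sigma^2)$

$$p(y_i \mid x_i, \theta) = \frac{1}{\sqrt{2\pi\sigma^2}}\; e^{-\frac{1}{2\sigma^2}(y_i - x_i\theta)^2}$$

Inserting this into the log-likelihood and writing it in matrix form, we get:

$$\begin{aligned}
\theta_{ML} &= \arg\max_{\theta}\left[ -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}(y - X\theta)^T (y - X\theta) \right] \\
\theta_{ML} &= (X^T X)^{-1} X^T y
\end{aligned}$$

The first term does not depend on $\theta$, so maximizing the log-likelihood is the same as minimizing the squared error. So the MLE is the same as the least squares estimate we found previously.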
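The equivalence between the Gaussian MLE and least squares can be sanity-checked numerically: the least squares estimate should give a higher log-likelihood than any nearby parameter vector. The sketch below uses synthetic data; the function name `gaussian_log_likelihood`, the chosen noise level and the perturbation scheme are assumptions made for illustration.

```python
import numpy as np

def gaussian_log_likelihood(theta, X, y, sigma2):
    """log p(y | X, theta) for y_i ~ N(x_i theta, sigma2), samples independent."""
    n = len(y)
    residual = y - X @ theta
    return -0.5 * n * np.log(2 * np.pi * sigma2) - residual @ residual / (2 * sigma2)

rng = np.random.default_rng(1)

# Synthetic data (assumption): bias column plus two features, noise std 0.3.
X = np.hstack([np.ones((100, 1)), rng.normal(size=(100, 2))])
y = X @ np.array([0.5, 2.0, -1.0]) + 0.3 * rng.normal(size=100)

# Least squares estimate.
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]
best = gaussian_log_likelihood(theta_ls, X, y, sigma2=0.09)

# Log-likelihood at randomly perturbed parameters: each should be lower.
perturbed = [
    gaussian_log_likelihood(theta_ls + 0.1 * rng.normal(size=3), X, y, sigma2=0.09)
    for _ in range(20)
]
print(best, max(perturbed))
```

The value chosen for `sigma2` shifts and scales the log-likelihood but does not change which `theta` maximizes it, which is why the noise variance drops out of the final estimate.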

II.2 Logistic Regression

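Sigmoid function:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Probability of a binary output:

$$p(y \mid X, \theta) = \prod_{i=1}^{n} \hat{y}_i^{\,y_i}\,(1 - \hat{y}_i)^{(1 - y_i)}, \qquad \hat{y}_i = \sigma(x_i\theta)$$

Maximum Likelihood Estimate:

$$\begin{aligned}
\theta_{ML} &= \arg\max_{\theta}\; \log p(y \mid X, \theta) \\
&= \arg\max_{\theta}\; \log \prod_{i=1}^{n} \hat{y}_i^{\,y_i}\,(1 - \hat{y}_i)^{(1 - y_i)} \\
&= \arg\max_{\theta}\; \sum_{i=1}^{n} y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)
\end{aligned}$$

The negative of this sum is called the binary cross-entropy (BCE) loss, so maximizing the log-likelihood is the same as minimizing the BCE.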
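Since logistic regression has no closed-form solution, the BCE objective is typically minimized iteratively. Here is a minimal NumPy sketch using plain gradient descent on the mean BCE; the learning rate, step count, toy data and function names (`sigmoid`, `bce`, `fit_logistic`) are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, y_hat, eps=1e-12):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

def fit_logistic(X, y, lr=0.5, n_steps=500):
    """Gradient descent on the mean BCE; returns the weights and the loss history."""
    theta = np.zeros(X.shape[1])
    history = []
    for _ in range(n_steps):
        y_hat = sigmoid(X @ theta)
        history.append(bce(y, y_hat))
        grad = X.T @ (y_hat - y) / len(y)   # gradient of the mean BCE w.r.t. theta
        theta -= lr * grad
    return theta, history

# Toy 1-D data (assumption): class 0 centred at -2, class 1 centred at +2.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 100), rng.normal(2, 1, 100)])
X = np.column_stack([np.ones_like(x), x])   # bias column plus the feature
y = np.concatenate([np.zeros(100), np.ones(100)])

theta, history = fit_logistic(X, y)
accuracy = np.mean((sigmoid(X @ theta) > 0.5) == y)
print(history[0], history[-1], accuracy)
```

With all weights at zero every prediction is 0.5, so the first recorded loss equals log 2; the loss then decreases as the weights move towards the maximum likelihood estimate.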

In the more general case (number of classes > 2), the cross-entropy loss for a single sample can be written as:

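$$L(\hat{y}_i, y_i) = \sum_{j=1}^{C} y_{i,j}\,\log \hat{y}_{i,j}$$

where $C$ is the number of classes, $y_{i,j}$ is 1 if sample $i$ belongs to class $j$ and 0 otherwise, and $\hat{y}_{i,j}$ is the predicted probability of class $j$ for sample $i$.

Cost function (mean of losses for all samples): $C(\theta) = -\frac{1}{n}\sum_{i=1}^{n} L(\hat{y}_i, y_i)$, optimized via gradient descent (no closed form).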
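The multi-class cost can be written in a few lines of NumPy. The sketch below assumes one-hot labels and uses a softmax to turn raw scores into class probabilities; the softmax is a common choice but is an assumption here, since the notes do not say how the predicted probabilities are produced. The names `softmax` and `cross_entropy_cost` and the toy inputs are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() from overflowing."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy_cost(y_onehot, y_hat, eps=1e-12):
    """Negative mean over samples of sum_j y_ij * log(y_hat_ij)."""
    per_sample = (y_onehot * np.log(np.clip(y_hat, eps, 1.0))).sum(axis=1)
    return -per_sample.mean()

# Hypothetical scores for 2 samples and 3 classes, with one-hot labels.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.0, 0.0, 0.0]])
y_onehot = np.array([[1, 0, 0],
                     [0, 0, 1]])

y_hat = softmax(logits)
cost = cross_entropy_cost(y_onehot, y_hat)
print(y_hat, cost)
```

Because the labels are one-hot, only the log-probability of the correct class contributes for each sample; for two classes the expression reduces to the binary cross-entropy.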
