Deep Learning Course Introduction Slides

This document provides an introduction to deep learning through a lecture given by Dr. Sugata Ghosal. It discusses the history and development of neural networks from early associationist models of cognition to modern deep learning techniques. Some key points covered include:
• Early neural network models like the perceptron that were inspired by the brain's interconnected neurons.
• Breakthroughs achieved by neural networks in applications like image recognition, natural language processing, and more.
• The objectives of the course, which are to understand neural network models, design and implement networks, and explore applications.
• An overview of early neural network research and the development of the multi-layer perceptron capable of learning complex patterns and

Uploaded by

ABI RAJESH GANESHA RAJA

Deep Learning

Dr. Sugata Ghosal

BITS Pilani [Link]@[Link]
Pilani Campus

Lecture No. 1 | Introduction

Time: 11 AM – 1 PM
Date: 07/05/2022
These slides are assembled by the instructor with grateful acknowledgement of the many
others who made their course materials freely available online.
Agenda

• Introduction
• Course Objectives and Logistics
• Introduction to Perceptron and MLP
• Approximation Capabilities
• Characteristics of Deep Learning
Neural Networks are taking over!

• Neural networks have recently become one of the major thrust areas in various pattern recognition, prediction, and analysis problems
• In many problems they have established the state of the art
– Often exceeding previous benchmarks by large margins
Breakthroughs with neural networks

• Image segmentation and recognition
[Link]

Successes with neural networks

• Captions generated entirely by a neural network
• And a variety of other problems:
– From art to astronomy to healthcare
– And even predicting stock markets!
Objectives of this course
• Understanding neural networks
• Comprehending the models that do the previously mentioned tasks
– And maybe building them
• Design, build and train networks for various tasks

• You will not become an expert in one course

Course objectives: Broad level

Deep Dive into Artificial Neural Networks

• Concepts
– Types of neural networks and underlying ideas
– Learning in neural networks
• Training, concepts, practical issues
– Architectures and applications

• Practical
– Familiarity with training and parameter tuning
– Implement various neural network architectures
• Overall: Set you up for further work in your area
Course learning objectives: Topics
• Basic network formalisms (for classification and prediction):
– Multi-Layer Perceptron (MLP)
– Convolutional networks (CNN)
– Recurrent networks (RNN)
• Some advanced formalisms (for creation)
– Generative models: VAEs
– Adversarial models: GANs
• Applications we will touch upon:
– Computer vision: recognizing images
– Text processing: modelling and generating language
– ….
Reading
• List of books on Canvas Course Page
• Primary: [Link]
• “Deep Learning”, Goodfellow, Bengio, Courville

• Reference: [Link]
learning-with-python
• “Deep Learning with Python”, Francois Chollet.

• Additional reading material will be posted on Canvas, if needed
Logistics
• Most relevant info on Canvas
– Handout
– Schedule of Webinars, Quiz, Assignments, ….
– Lecture Slides
– Lab Sheets
– 3 quizzes; the best 2 scores will be counted
– Two assignments
– Quiz and one assignment before the midsem
– One assignment after the midsem
– Submissions after the deadline lose some marks per day (except in medical emergencies)
– Programming in Python with Keras / TensorFlow
Webinars (3-4)

• Held during evenings (around 7:30 PM)

• Will cover details of the lab sheet, if needed, and basic exercises
– Important if you wish to get the maximum out of the course
Questions?

• Please post on Discussions Forum

• TAs and instructors will answer
• Collaborate with your fellow students
So what are neural networks??

Voice signal -> [Link] -> Transcription
Image -> [Link] -> Text caption
Game state -> [Link] -> Next move

• What are these boxes?

So what are neural
networks??

• It begins with this..

Early Models of Human
Cognition

• Associationism
– Humans learn through association
• 400BC-1900AD: Plato, David Hume, Ivan Pavlov..
What are “Associations”?

• Lightning is generally followed by thunder
– Ergo: “Hey, here’s a bolt of lightning, we’re going to hear thunder”
– Ergo: “We just heard thunder; did someone get hit by lightning?”

• Association!
Observation: The Brain

• Mid 1800s: The brain is a mass of interconnected neurons
Brain: Interconnected
Neurons

• Many neurons connect in to each neuron

• Each neuron connects out to many neurons
Connectionist Machines

• Network of processing elements

• All world knowledge is stored in the connections
between the elements
Connectionist Machines

• Neural networks are connectionist machines

– As opposed to Von Neumann Machines

[Figure: a Von Neumann/Princeton machine, in which a processor (processing unit) works on a program and data held in memory, beside a neural network, drawn as a single network]

• The machine has many non-linear processing units

– The program is the connections between these units
• Connections may also define memory
Modelling the brain
• What are the units?
• A neuron, made up of a soma, dendrites, and an axon

• Signals come in through the dendrites into the Soma

• A signal goes out via the axon to other neurons
– Only one axon per neuron
• Factoid that may only interest me: Neurons do not undergo cell division
– Neurogenesis occurs from neuronal stem cells, and is minimal after birth
Rosenblatt’s perceptron

• Original perceptron model

– Groups of sensors (S) on retina combine onto cells in association
area A1
– Groups of A1 cells combine into Association cells A2
– Signals from A2 cells combine into response cells R
– All connections may be excitatory or inhibitory
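Simplified mathematical model of Perceptron

• A number of inputs combine linearly
– Threshold logic: fire if the combined input is greater than or equal to the threshold

$$y = \begin{cases} 1 & \text{if } \sum_i w_i x_i \ge T \\ 0 & \text{otherwise} \end{cases}$$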
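The threshold rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function name `perceptron_fires` and the example weights and thresholds are assumptions chosen for the demo.

```python
def perceptron_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    combined = sum(w * x for w, x in zip(weights, inputs))
    return 1 if combined >= threshold else 0


# Illustrative values only: two inputs with unit weights.
print(perceptron_fires([1, 0], [1, 1], threshold=1))  # 1, since 1 >= 1
print(perceptron_fires([1, 0], [1, 1], threshold=2))  # 0, since 1 < 2
```

The comparison uses `>=` so that an input exactly at the threshold fires, matching the "exceeds or equal to" wording on the slide.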
His “Simple” Perceptron
• Originally assumed to be able to represent any Boolean circuit and perform any logic
– “the embryo of an electronic computer that [the Navy] expects
will be able to walk, talk, see, write, reproduce itself and be
conscious of its existence,” New York Times (8 July) 1958
– “Frankenstein Monster Designed by Navy That Thinks,” Tulsa,
Oklahoma Times 1958
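Also provided a learning algorithm

Sequential learning (the standard perceptron update):

$$\mathbf{w} \leftarrow \mathbf{w} + \eta \left(d(\mathbf{x}) - y(\mathbf{x})\right)\mathbf{x}$$

$d(\mathbf{x})$ is the desired output in response to input $\mathbf{x}$
$y(\mathbf{x})$ is the actual output in response to $\mathbf{x}$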
• Boolean tasks
• Update the weights whenever the perceptron
output is wrong
• Proved convergence for linearly separable classes
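A minimal sketch of this sequential update, trained on the Boolean AND task, might look as follows. It assumes the standard perceptron rule (add the learning rate times the error times the input whenever the output is wrong) and folds the threshold into a learned bias term; the names `predict`, `train_perceptron`, and `AND_SAMPLES`, and the learning rate of 1.0, are choices made for this illustration rather than anything given in the slides.

```python
def predict(weights, bias, x):
    """Threshold unit: fire when the weighted sum plus bias is non-negative."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total >= 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Visit samples one at a time and update only when the output is wrong."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, desired in samples:
            error = desired - predict(weights, bias, x)  # -1, 0, or +1
            if error != 0:
                weights = [w + learning_rate * error * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * error
                mistakes += 1
        if mistakes == 0:  # a full pass with no errors: training has converged
            break
    return weights, bias


# AND is linearly separable, so the updates stop after a finite number of passes.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
for x, desired in AND_SAMPLES:
    print(x, "->", predict(weights, bias, x), "(desired", desired, ")")
```

The `max_epochs` cap matters only for tasks that are not linearly separable, where the updates would otherwise never settle.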
Perceptron
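[Figure: three perceptrons over inputs X and Y. Read from the extracted labels, they appear to be X AND Y (weights 1, 1; threshold 2), NOT X (weight -1; threshold 0), and X OR Y (weights 1, 1; threshold 1).]

Values shown on edges are weights, numbers in the circles are thresholds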

• Easily shown to mimic any Boolean gate

• But…
Perceptron

No solution for XOR!

Not universal!

[Figure: a single perceptron whose weights and threshold are all marked “?”]

• Minsky and Papert, 1969
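The claim can be checked empirically with a small search. The sketch below tries every combination of two weights and a threshold on a coarse grid and reports whether any single threshold unit reproduces a given truth table. The grid, the helper names `fires` and `find_unit`, and the use of `>=` for firing are assumptions of this illustration; a finite search is evidence, not a proof.

```python
from itertools import product


def fires(x, y, wx, wy, threshold):
    """Single threshold unit over two Boolean inputs."""
    return 1 if wx * x + wy * y >= threshold else 0


def find_unit(truth_table, grid):
    """Return the first (wx, wy, threshold) on the grid matching the table, or None."""
    for wx, wy, threshold in product(grid, repeat=3):
        if all(fires(x, y, wx, wy, threshold) == out
               for (x, y), out in truth_table.items()):
            return wx, wy, threshold
    return None


GRID = [v / 2 for v in range(-8, 9)]  # -4.0 to 4.0 in steps of 0.5

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print("AND:", find_unit(AND, GRID))  # some matching triple
print("OR: ", find_unit(OR, GRID))   # some matching triple
print("XOR:", find_unit(XOR, GRID))  # None
```

The search fails for XOR for a reason that holds for all real weights: input (0,0) must not fire, so the threshold T is positive; inputs (1,0) and (0,1) must fire, so each weight is at least T; but then the two weights sum to at least 2T, which is above T, so (1,1) fires as well.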

A single neuron is not
enough

• Individual elements are weak computational elements

– Marvin Minsky and Seymour Papert, 1969, Perceptrons:
An Introduction to Computational Geometry

• Networked elements are required

Multi-layer Perceptron!
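[Figure: a two-layer network computing XOR of X and Y. Read from the extracted labels, the hidden layer appears to hold an OR unit (weights 1, 1; threshold 1) and a NOT-both unit (weights -1, -1; threshold -1), and the output unit combines them with weights 1, 1 and threshold 2.]

• XOR
– The first layer is a “hidden” layer
– Also originally suggested by Minsky and Papert, 1969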
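As a sketch of how a hidden layer solves XOR, the network below uses two hidden threshold units and one output unit. The specific weights and thresholds are an assumed reading of the slide's figure (an OR unit, a "not both" unit, and an AND on top); other choices work equally well. The helper name `unit` is hypothetical.

```python
def unit(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def xor(x, y):
    # Hidden layer: two units that each see both inputs.
    either = unit((x, y), (1, 1), 1)        # X OR Y
    not_both = unit((x, y), (-1, -1), -1)   # NOT (X AND Y)
    # Output layer: fire only when both hidden units fire.
    return unit((either, not_both), (1, 1), 2)


for x in (0, 1):
    for y in (0, 1):
        print(x, y, "->", xor(x, y))
# 0 0 -> 0
# 0 1 -> 1
# 1 0 -> 1
# 1 1 -> 0
```

Each hidden unit is still a linear classifier; XOR becomes possible only because the output unit combines two of them.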
A more generic model

[Figure: a multi-layer perceptron with several layers of threshold units over inputs X, Y, Z, A]

• A “multi-layer” perceptron
• Can compose arbitrarily complicated Boolean functions!
– In cognitive terms: Can compute arbitrary Boolean functions over
sensory input
– More on this in the next class
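One way to see why arbitrary Boolean functions are reachable is the following construction, offered as a sketch rather than as the lecture's own method: give the hidden layer one unit per input pattern that should output 1, wire each unit to fire only on its pattern, and let the output unit OR them together. The names `build_network` and `evaluate` are hypothetical, and 3-input parity is used only as a test case.

```python
from itertools import product


def unit(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def build_network(truth_table):
    """One hidden unit per input pattern whose desired output is 1."""
    hidden = []
    for pattern, output in truth_table.items():
        if output == 1:
            # +1 on bits that must be 1, -1 on bits that must be 0; the unit
            # reaches its threshold only when the input equals the pattern.
            weights = [1 if bit else -1 for bit in pattern]
            hidden.append((weights, sum(pattern)))
    return hidden


def evaluate(hidden, inputs):
    detected = [unit(inputs, weights, threshold) for weights, threshold in hidden]
    return unit(detected, [1] * len(detected), 1)  # OR over the hidden units


# 3-input parity: output 1 when an odd number of inputs are 1.
PARITY = {bits: sum(bits) % 2 for bits in product((0, 1), repeat=3)}
network = build_network(PARITY)
print(all(evaluate(network, bits) == out for bits, out in PARITY.items()))  # True
```

This construction can need one hidden unit per true row of the truth table, so it shows that a network exists, not that it is small.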
But our brain is not
Boolean

• We have real inputs

• We make non-Boolean inferences/predictions
The perceptron with real
inputs
• x1…xN are real-valued
• w1…wN are real-valued
• Unit “fires” if weighted input exceeds a threshold
The perceptron with real
inputs and a real output
[Figure: inputs x1, x2, x3 and a bias b feed a unit that applies a sigmoid to the weighted sum over i of w_i x_i]

• x1…xN are real-valued
• w1…wN are real-valued
• The output y can also be real valued
– Sometimes viewed as the “probability” of firing
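A minimal sketch of a perceptron with a real-valued output is shown below. It assumes the bias b is added to the weighted sum before the sigmoid is applied, which is a common convention but not something the extracted slide confirms; the function names and the example numbers are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def sigmoid_perceptron(inputs, weights, bias):
    """Real-valued output in [0, 1], readable as a 'probability' of firing."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(weighted_sum + bias)


# Illustrative inputs and weights, not taken from the slides.
print(sigmoid_perceptron((1.0, 2.0, 3.0), (0.5, -0.25, 0.1), bias=0.0))
```

Unlike the threshold unit, the output changes smoothly as the weights change, and an input whose weighted sum plus bias is exactly zero gives 0.5.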
The “real” valued
perceptron
[Figure: inputs x1, x2, x3 and a bias b feed a unit that outputs f(sum)]

• Any real-valued “activation” function may operate on the weighted-sum input
– We will see several later
– Output will be real valued
• The perceptron maps real-valued inputs to real-valued outputs
• It is useful to continue assuming Boolean outputs though, for interpretation
A Perceptron on Reals

[Figure: a perceptron over inputs x1, x2, x3, …, xN with outputs 1 and 0, beside a plot in the (x1, x2) plane of the boundary w1 x1 + w2 x2 = T]

• A perceptron operates on real-valued vectors
– This is a linear classifier
Boolean functions with a
real perceptron

[Figure: three plots over the unit square with corners (0,0), (1,0), (0,1), (1,1), with axes labelled X and Y; each shades the region where a perceptron outputs 1]

• Boolean perceptrons are also linear classifiers
– Purple regions have output 1 in the figures
– What are these functions?
– Why can we not compose an XOR?
Composing complicated
“decision” boundaries

• Perceptrons can now be composed into “networks” to compute arbitrary classification “boundaries”
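As a small sketch of such a composition, the network below marks the points inside the unit square. Four hidden threshold units each test one half-plane, and an output unit fires only when all four agree. The choice of a square, the weights, and the names `HALF_PLANES` and `inside_unit_square` are assumptions made for this illustration.

```python
def unit(inputs, weights, threshold):
    """Threshold unit: 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Each hidden unit is a linear classifier for one side of the square.
HALF_PLANES = [
    ((1, 0), 0),    # x1 >= 0
    ((-1, 0), -1),  # x1 <= 1
    ((0, 1), 0),    # x2 >= 0
    ((0, -1), -1),  # x2 <= 1
]


def inside_unit_square(point):
    votes = [unit(point, weights, threshold) for weights, threshold in HALF_PLANES]
    return unit(votes, (1, 1, 1, 1), 4)  # AND: all four half-planes must agree


print(inside_unit_square((0.5, 0.5)))   # 1
print(inside_unit_square((1.5, 0.5)))   # 0
print(inside_unit_square((-0.1, 0.2)))  # 0
```

No single perceptron can produce this closed region, since one unit only ever splits the plane along a single line; more half-planes in the hidden layer give finer polygonal boundaries.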