
Deep Learning Basics: Weights & Biases

Unit 1 introduces deep learning concepts including weights, biases, neurons, activation functions, and the training process of neural networks. It explains how these components work together to minimize loss through techniques like backpropagation and gradient descent. Real-life analogies are provided to illustrate these concepts, making them more relatable and easier to understand.

Uploaded by

tejashirurkar78


Unit 1: Introduction to Deep Learning
Basic Terminology in Deep Learning: Weights, Biases, Neurons, Activation Functions
Training a Neural Network, Forward Pass, Loss Functions (MSE, Cross-Entropy), Backpropagation and Gradient Descent
Weights
• Numerical parameters in a neural network.
• Represent the strength/importance of input features.
• Each input is multiplied by a weight before being passed to the neuron.

Equation:
$z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b$
Why are Weights Important?
• Decide how much influence each input has on the output.
• Adjusted during training using backpropagation + gradient descent.
• Well-tuned weights → more accurate predictions.
Real-Life Analogy of Weights
Cooking Recipe Analogy
• Inputs = ingredients (flour, sugar, salt, oil).
• Weights = how much of each ingredient you add.
• Bias = the “extra spice” you always add no matter what.
• Output = the taste of the dish.
• Training = trying, tasting, and adjusting ingredient amounts.
Case Study Analogy of Weights: Online Movie Recommendation
• Inputs = user’s past viewing history (action, comedy, drama, thriller).
• Weights = how much importance the system gives to each genre.
i. Action → +0.8
ii. Drama → +0.6
iii. Comedy → +0.2
iv. Thriller → -0.4 (user dislikes thrillers, so negative weight).
• Bias = general trend (e.g., festive season → push family
movies).
• Output = recommended movie list.
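The weighted sum from the movie example can be sketched in a few lines of Python. The genre weights are the ones listed above; the bias value and the two sample movies are assumptions made only for illustration, and `weighted_sum` is a hypothetical helper name.

```python
# Genre weights taken from the movie-recommendation example above.
weights = {"action": 0.8, "drama": 0.6, "comedy": 0.2, "thriller": -0.4}

# Assumed value: the slides do not give a number for the bias.
bias = 0.1


def weighted_sum(features, weights, bias):
    """Return z = w1*x1 + w2*x2 + ... + wn*xn + b."""
    return sum(weights[name] * value for name, value in features.items()) + bias


# Hypothetical movies: 1.0 means the genre is present, 0.0 means absent.
action_thriller = {"action": 1.0, "drama": 0.0, "comedy": 0.0, "thriller": 1.0}
action_drama = {"action": 1.0, "drama": 1.0, "comedy": 0.0, "thriller": 0.0}

score_thriller = weighted_sum(action_thriller, weights, bias)  # 0.8 - 0.4 + 0.1
score_drama = weighted_sum(action_drama, weights, bias)        # 0.8 + 0.6 + 0.1
print(score_thriller, score_drama)
```

Under these assumptions the action/drama movie scores higher, because the negative thriller weight pulls the other score down.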
Bias
• Bias is a trainable parameter in a neural network.
• It is added to the weighted sum of inputs before applying the activation function.
• Mathematical form:
$z = (w_1 x_1 + w_2 x_2 + \dots + w_n x_n) + b$
where $b$ = bias.
Purpose of Bias
• Allows the model to fit data better, even when inputs are zero.
• Gives the neuron an adjustable offset, so its output does not depend on the weighted inputs alone.
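Example
• Without bias: the neuron's weighted sum is always 0 when every input is 0, so the fitted line or plane must pass through the origin (0,0).
• With bias: the neuron can shift that line or plane away from the origin and better match real-world data.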
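A minimal sketch of this difference, assuming a one-input neuron with no activation function. The weight and bias values are arbitrary, and `linear_neuron` is a hypothetical name.

```python
def linear_neuron(x, w, b=0.0):
    """Pre-activation of a one-input neuron: z = w*x + b."""
    return w * x + b


w = 2.0  # assumed weight
b = 5.0  # assumed bias

# Without a bias, a zero input always gives zero, whatever the weight is.
no_bias_at_zero = linear_neuron(0.0, w)

# With a bias, the output at zero input equals the bias.
with_bias_at_zero = linear_neuron(0.0, w, b)

print(no_bias_at_zero, with_bias_at_zero)
```

The bias shifts every output by the same amount, which is what moves the line away from the origin.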
Case Study Analogy of Bias
House Price Prediction
• Inputs = square footage, number of rooms, location score.
• Weights = importance of each input.
• Bias = base price of the house (even an empty plot has
some cost).
• Without bias → house with 0 sq. ft. and 0 rooms = price = 0
(not realistic).
• With bias → model accounts for base land value, permits,
Neurons
• A neuron is the basic unit of a neural network.
• It takes multiple inputs, applies weights, adds a bias, passes the result through an activation function, and produces an output.
Mathematical Representation:
$y = f(w_1 x_1 + w_2 x_2 + \dots + w_n x_n + b)$
where:
$x_i$ = inputs
$w_i$ = weights
$b$ = bias
$f$ = activation function
$y$ = output
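The formula maps directly to code. The sketch below assumes a sigmoid as the activation function $f$ and uses made-up inputs and weights; `neuron` is a hypothetical helper, not part of any library.

```python
import math


def sigmoid(z):
    """Logistic activation, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute y = f(w1*x1 + ... + wn*xn + b)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Assumed values: z = 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and sigmoid(0) = 0.5.
y = neuron([1.0, 2.0], [0.5, -0.25], bias=0.0)
print(y)
```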
Purpose of Neuron:
• Captures patterns and relationships in data.
• Multiple neurons together form layers, which
combine to create deep networks.
Analogy (Real-Life Example) of Neurons
Light Bulb Analogy
• Inputs = switches connected to the bulb.
• Weights = thickness/quality of the wires (how much current passes).
• Bias = small backup battery to ensure minimum current.
• Activation function = ON/OFF threshold of the bulb (does it glow or not?).
• Output = brightness of the bulb.
Case Study Analogy of Neurons
Email Spam Detection
• Inputs = words in the email (e.g., “discount”, “lottery”, “urgent”).
• Weights = importance of each word (e.g., “lottery” → +0.9, “hello” → +0.1).
• Bias = base tendency (the system might flag a suspicious email even if it contains few spam words).
• Activation function = decision boundary (spam or not spam).
Activation Function
• An activation function decides whether a neuron should “fire”
or not.
• It introduces non-linearity into the model.
• Without activation functions, neural networks would just be
linear models, no matter how many layers you stack.
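The last point can be checked with a small sketch. It assumes one-input layers with arbitrary weights and shows that two stacked linear layers behave exactly like one linear layer; the names are hypothetical.

```python
def linear_layer(x, w, b):
    """A layer with no activation function: w*x + b."""
    return w * x + b


# Assumed parameters for two stacked layers.
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5


def two_layers(x):
    """Layer 2 applied to the output of layer 1, with no activation in between."""
    return linear_layer(linear_layer(x, w1, b1), w2, b2)


# Expanding w2*(w1*x + b1) + b2 gives a single equivalent layer.
w_eq = w2 * w1
b_eq = w2 * b1 + b2


def one_layer(x):
    return linear_layer(x, w_eq, b_eq)


for x in [-2.0, 0.0, 1.5, 10.0]:
    print(x, two_layers(x), one_layer(x))
```

Placing a non-linear activation between the two layers breaks this equivalence, which is why depth only adds expressive power when activations are present.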

Role in Neural Networks

• Maps the weighted sum of inputs into an output.
• Helps the network learn complex patterns (curves, images,
language, etc.).
• Controls the range of output values (e.g., 0–1, -1 to +1).
Types of Activation Functions
1. Step Function: Outputs 0 or 1 based on threshold; basic perceptron activation.
2. Sigmoid (Logistic): Smoothly maps inputs to range (0, 1); good for probabilities.
3. Tanh (Hyperbolic Tangent): Maps inputs to (-1, 1); zero-centered activation.
4. ReLU (Rectified Linear Unit): Outputs max(0, x); widely used for hidden layers.
5. Leaky ReLU: Keeps a small non-zero slope for x < 0 to avoid dead neurons.
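The five functions can be written as plain Python for scalar inputs. This is a sketch rather than a library implementation: the step threshold of 0 and the leaky slope of 0.01 are assumed defaults, and this simple `sigmoid` can overflow for very large negative inputs.

```python
import math


def step(x, threshold=0.0):
    """Output 1 at or above the threshold, otherwise 0."""
    return 1.0 if x >= threshold else 0.0


def sigmoid(x):
    """Smoothly map x into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Map x into (-1, 1), centered on zero."""
    return math.tanh(x)


def relu(x):
    """Return max(0, x)."""
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Like ReLU, but scale negative inputs by a small slope instead of zeroing them."""
    return x if x >= 0 else negative_slope * x


for name, fn in [("step", step), ("sigmoid", sigmoid), ("tanh", tanh),
                 ("relu", relu), ("leaky_relu", leaky_relu)]:
    print(name, fn(-2.0), fn(0.0), fn(2.0))
```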
Training a Neural Network
Training a neural network is the process of teaching it to learn patterns in data by adjusting its weights and biases using a training dataset. The goal is to minimize the error (loss) between the predicted output and the actual output.
Key Components of Training
1. Input Data: Features used by the network (e.g., pixel values for an image).
2. Weights & Biases: Parameters that are adjusted during training.
3. Activation Functions: Introduce non-linearity and allow the network to model complex relationships.
4. Loss Function: Measures how far the predictions are from actual outputs (e.g., MSE, Cross-Entropy).
5. Optimizer: Algorithm to adjust weights/biases (e.g., Gradient Descent).
6. Learning Rate: Step size for updating weights.
Step-by-Step Process of Training
Step 1: Forward Pass
• Inputs are fed through the network.
• Each neuron computes a weighted sum + bias.
• Activation function is applied to generate output.
• The network produces predictions.
Step 2: Compute Loss
• Compare predictions with actual outputs using a loss function.
• Examples:
1. MSE (Mean Squared Error) for regression
2. Cross-Entropy Loss for classification
Step 3: Backpropagation
• Compute gradients of loss w.r.t. weights and biases using the chain rule.
• Determines how much each weight contributed to the error.
Step 4: Weight & Bias Update
Use an optimizer (e.g., Gradient Descent) to update weights and biases:
$w_{new} = w_{old} - \eta \cdot \frac{\partial L}{\partial w}$
$\eta$ = learning rate, $L$ = loss
Step 5: Repeat
Repeat forward pass → loss → backpropagation → weight update for many epochs until loss is minimized.
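The five steps can be sketched as a loop for the simplest possible model: one neuron with one input and no activation function, trained with MSE. The toy dataset (points on y = 2x + 1), the learning rate, and the epoch count are all assumptions chosen for illustration.

```python
# Toy dataset (assumed for illustration): points on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0        # initial weight and bias
learning_rate = 0.05   # eta, an assumed value
losses = []

for epoch in range(2000):
    n = len(xs)

    # Step 1: forward pass, one prediction per input.
    preds = [w * x + b for x in xs]

    # Step 2: compute the loss (mean squared error).
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
    losses.append(loss)

    # Step 3: gradients of the loss w.r.t. w and b (chain rule, derived by hand).
    grad_w = sum(2.0 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2.0 * (p - y) for p, y in zip(preds, ys)) / n

    # Step 4: update, w_new = w_old - eta * dL/dw (same for b).
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

    # Step 5: the loop repeats for the next epoch.

print(w, b, losses[0], losses[-1])
```

With these settings the parameters approach w = 2 and b = 1. In a real network the gradients in step 3 come from backpropagation through every layer rather than from a hand-derived formula.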
Easy-to-Understand Analogy of Training a Neural Network
Analogy: Learning to Bake a Cake
• Input ingredients: Flour, sugar, eggs = Input features
• Recipe instructions: Network structure & activation functions
• Taste test: Loss function measures how good the cake is
• Adjust ingredients: Backpropagation adjusts flour/sugar/eggs = updating weights
• Keep baking: Repeat until cake tastes perfect = Training for many epochs
Forward Pass
The Forward Pass is the process in a neural network where input data is passed through each layer, each neuron computes its output using weights, biases, and activation functions, and finally the network produces a predicted output.
It is the first step in training.
Step-by-Step Process of Forward Pass
1. Input Layer: Receives raw input features (e.g., pixel values, numerical data).
2. Hidden Layers Computation:
• Each neuron computes the weighted sum of inputs + bias:
$z = \sum_{i=1}^{n} w_i x_i + b$
• Apply the activation function to introduce non-linearity:
$a = f(z)$
3. Output Layer:
• Receives outputs from the last hidden layer
• Computes weighted sum + bias
• Applies activation (e.g., softmax for classification, linear for regression)
• Produces the final prediction
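The three stages above can be sketched for a tiny classifier. The layer sizes (2 inputs, 2 hidden neurons, 2 classes), the ReLU hidden activation, and every parameter value are assumptions; `dense` and `forward` are hypothetical helper names.

```python
import math


def relu(z):
    return max(0.0, z)


def dense(inputs, weights, biases):
    """One layer: z_j = sum_i weights[j][i] * inputs[i] + biases[j]."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(zs):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(zs)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> hidden layer (ReLU) -> output layer (softmax)."""
    hidden = [relu(z) for z in dense(x, hidden_w, hidden_b)]
    return softmax(dense(hidden, out_w, out_b))


# Assumed toy parameters.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 0.0], [0.0, 1.0]]
out_b = [0.0, 0.0]

probs = forward([2.0, 1.0], hidden_w, hidden_b, out_w, out_b)
print(probs)
```

For a regression network the final `softmax` would be replaced by a linear (identity) output, as noted in step 3.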
Easy-to-Understand Analogy of Forward Pass
Analogy: Assembly Line Factory
1. Raw materials (input data) enter the assembly line (input layer).
2. Each worker (neuron) processes materials based on instructions (weights and biases).
3. Activation functions adjust the output at each step.
4. The product (predicted output) comes out at the end of the line (output layer).
Loss Functions (MSE, Cross-Entropy)
A loss function is a mathematical function that measures how far the predicted output of a neural network is from the actual target.
• It is also called cost function or objective function.
• The goal of training is to minimize the loss by adjusting weights and biases.
Types of Loss Functions
1. Mean Squared Error (MSE)
• Definition: Measures the average squared difference between predicted and actual values. Commonly used for regression problems.
• Formula: $\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$, where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of samples.
• Easy Analogy of MSE: Guessing weight of apples. If you guess 100g and actual is 120g, the squared difference = 400. Average the squared differences over all guesses to see the overall error.
• Range: 0 → ∞ (0 means perfect prediction)
• Real-Life Applications:
1. Predicting house prices
2. Predicting stock prices
3. [...] prediction
4. Predicting crop yield
5. [...]-based medical measurements
• When to Use: Use MSE for regression problems where output is continuous.
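A short sketch of MSE using the apple example. The first guess (100g against an actual 120g) comes from the analogy above; the second guess of 130g is an assumption added to show the averaging step.

```python
def mse(predictions, targets):
    """Mean of the squared differences between predictions and targets."""
    if not predictions or len(predictions) != len(targets):
        raise ValueError("need two non-empty lists of equal length")
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)


# One guess: (100 - 120)**2 = 400.
single = mse([100.0], [120.0])

# Two guesses: squared errors 400 and 100 (assumed second guess), average 250.
average = mse([100.0, 130.0], [120.0, 120.0])

print(single, average)
```

Squaring means the 20g miss counts four times as much as the 10g miss, which is how MSE penalizes large errors heavily.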
2. Cross-Entropy Loss (Log Loss)
• Definition: Measures the difference between predicted probability distribution and actual distribution. Commonly used for classification problems.
• Formula (Binary Classification): $L = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$, where $y_i \in \{0, 1\}$ is the actual label and $\hat{y}_i$ is the predicted probability of class 1.
• Easy Analogy: Choosing the correct door in a game show
1. If you say a door is 90% likely to be correct and it turns out to be wrong, the penalty is high.
2. This pushes the network to assign high probability to the correct answer.
• Range: 0 → ∞ (0 means perfect prediction)
• Real-Life Applications of Cross-Entropy Loss:
1. Image classification (cats vs dogs vs other animals)
2. Handwritten digit recognition (MNIST)
3. Spam detection (email spam/not spam)
4. Diagnosis (multi-class medical conditions)
5. Sentiment analysis (positive, negative, neutral)
• When to Use Cross-Entropy Loss: Use Cross-Entropy for classification problems where outputs are probabilities.
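A minimal sketch of binary cross-entropy, applied to the game-show case of a 90% confident prediction. The clipping constant `eps` is an assumption added to avoid taking log(0); it is not part of the definition.

```python
import math


def binary_cross_entropy(targets, probs, eps=1e-12):
    """Average of -[y*log(p) + (1 - y)*log(1 - p)] over all samples."""
    total = 0.0
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)


# 90% confidence in the class that turns out to be correct: small loss.
confident_right = binary_cross_entropy([1], [0.9])

# 90% confidence in the class that turns out to be wrong: large loss.
confident_wrong = binary_cross_entropy([0], [0.9])

print(confident_right, confident_wrong)
```

The two losses are -ln(0.9) and -ln(0.1), so the confident wrong answer costs roughly twenty times more than the confident right one.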
Short Summary of Loss Functions
1. MSE → regression, continuous outputs, penalizes large errors heavily.
2. Cross-Entropy → classification, probabilistic outputs, penalizes confident wrong predictions heavily (the penalty grows logarithmically as the probability assigned to the correct class approaches 0).
3. Both are essential for gradient-based learning, because the optimizer uses loss gradients to update weights.
Backpropagation
Backpropagation (short for backward propagation of errors) is the process of calculating the gradient of the loss function with respect to each weight and bias of a neural network, so that the optimizer can update those parameters.
• It allows the network to learn from its mistakes.
• Works in conjunction with gradient descent to minimize loss.
Step-by-Step Process of Backpropagation
1. Forward Pass: Compute the predicted output for the input data.
2. Compute Loss: Compare prediction with actual output using a loss function.
3. Backward Pass (Backpropagation):
• Compute gradient of loss w.r.t. each weight and bias using the chain rule.
• Determines how much each parameter contributed to the error.
4. Weight & Bias Update: $w_{new} = w_{old} - \eta \cdot \frac{\partial L}{\partial w}$, and likewise for each bias $b$.
5. Repeat: Perform for all training examples (batch/mini-batch/epoch) until loss is minimized.
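The chain rule in step 3 can be sketched for a single sigmoid neuron with a squared-error loss. The input, target, and parameter values are assumptions, and the finite-difference check at the end is only there to confirm the hand-derived gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w, b):
    """Forward pass: z = w*x + b, then a = sigmoid(z)."""
    z = w * x + b
    return z, sigmoid(z)


def loss_fn(a, y):
    """Squared error for one sample."""
    return (a - y) ** 2


def backward(x, y, w, b):
    """Chain rule: dL/dw = dL/da * da/dz * dz/dw (and dz/db = 1 for the bias)."""
    _, a = forward(x, w, b)
    dL_da = 2.0 * (a - y)
    da_dz = a * (1.0 - a)  # derivative of the sigmoid
    dz_dw = x
    dz_db = 1.0
    return dL_da * da_dz * dz_dw, dL_da * da_dz * dz_db


# Assumed sample and parameters.
x, y, w, b = 1.5, 1.0, 0.4, -0.2
grad_w, grad_b = backward(x, y, w, b)

# Numerical check of dL/dw with a central finite difference.
h = 1e-6
numeric_w = (loss_fn(forward(x, w + h, b)[1], y)
             - loss_fn(forward(x, w - h, b)[1], y)) / (2.0 * h)

print(grad_w, numeric_w, grad_b)
```

Here the prediction is below the target, so the gradient with respect to w is negative and the update in step 4 increases w.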
Easy-to-Understand Analogy for Backpropagation
Analogy: Learning to Shoot Arrows
1. First, you shoot an arrow (forward pass).
2. You see where it lands compared to the target (loss computation).
3. You calculate how much to adjust your aim and angle (backpropagation).
4. Adjust and shoot again (weight update).
5. Repeat until you hit the target consistently.
Case Study / Real-Life Example of Backpropagation
Handwritten Digit Recognition (MNIST Dataset):
1. Input: 28x28 pixel image of a digit.
2. Forward Pass: Compute predicted probabilities using the current weights.
3. Loss Computation: Cross-Entropy loss between predicted probabilities and actual label.
4. Backpropagation: Compute gradients of loss w.r.t. all weights and biases.
5. Weight Update: Adjust weights to reduce error.
6. Repeat: For all images over multiple epochs until high accuracy is achieved.
Summary of Backpropagation
• Backpropagation is the core learning mechanism in neural
networks.
• Uses the chain rule to efficiently calculate gradients.
• Works hand-in-hand with gradient descent.
• Without backpropagation, the network cannot learn from its
mistakes.
Gradient Descent
Gradient Descent is an optimization algorithm used to minimize the loss function of a neural network by iteratively updating the weights and biases in the direction of the steepest decrease of the loss.
• It is the core method for training neural networks.
• Helps the network learn optimal parameters.
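Step-by-Step Process of Gradient Descent
1. Initialize Weights & Biases: Usually randomly.
2. Forward Pass: Compute predicted output.
3. Compute Loss: Measure difference between prediction and actual output.
4. Compute Gradients: Using backpropagation, calculate $\frac{\partial L}{\partial w}$ and $\frac{\partial L}{\partial b}$.
5. Update Parameters: Move in the opposite direction of the gradient:
$w_{new} = w_{old} - \eta \cdot \frac{\partial L}{\partial w}$, $b_{new} = b_{old} - \eta \cdot \frac{\partial L}{\partial b}$
6. Repeat: Continue over all training examples for multiple epochs until loss is minimized.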
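The update rule can be sketched on a one-parameter loss. The loss $L(w) = (w - 3)^2$, the starting point, and the learning rate are assumptions chosen so that the minimum (w = 3) is known in advance.

```python
def loss(w):
    """Assumed toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the toy loss: dL/dw = 2*(w - 3)."""
    return 2.0 * (w - 3.0)


def gradient_descent(w_start, learning_rate, steps):
    """Repeatedly apply w_new = w_old - eta * dL/dw and record each value."""
    w = w_start
    history = [w]
    for _ in range(steps):
        w = w - learning_rate * gradient(w)
        history.append(w)
    return w, history


w_final, history = gradient_descent(w_start=0.0, learning_rate=0.1, steps=100)
print(w_final, loss(w_final))
```

Each step moves w against the sign of the gradient, so the distance to the minimum shrinks by a constant factor (0.8 with these settings).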
Easy-to-Understand Analogy of Gradient Descent
Analogy: Hiking Down a Mountain to Find the Lowest Point
• You are on a mountain (loss function graph).
• Your goal is to reach the valley (minimum loss).
• You look at the slope (gradient) and take a step downhill
(update weights).
• Repeat until you reach the bottom (minimum loss).
Variants of Gradient Descent
1. Batch Gradient Descent: Uses the entire dataset to compute gradients.
2. Stochastic Gradient Descent (SGD): Updates weights after each training sample.
3. Mini-batch Gradient Descent: Uses a small batch of samples for each update (combines benefits of batch & SGD).
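The three variants differ only in how many samples feed each update. The sketch below assumes a dataset of 8 samples and a hypothetical `make_batches` helper; in practice the samples are usually shuffled before each epoch, which is omitted here.

```python
def make_batches(samples, batch_size):
    """Split the samples into consecutive batches of at most batch_size items."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


samples = list(range(8))  # stand-in for 8 training examples

# Batch gradient descent: one batch holding the whole dataset, 1 update per epoch.
batch_gd = make_batches(samples, batch_size=len(samples))

# Stochastic gradient descent: one sample per batch, 8 updates per epoch.
sgd = make_batches(samples, batch_size=1)

# Mini-batch gradient descent: small batches (size 4 assumed), 2 updates per epoch.
mini_batch = make_batches(samples, batch_size=4)

print(len(batch_gd), len(sgd), len(mini_batch))
```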
Real-Life Applications of Gradient Descent
1. Training neural networks for image recognition
2. Predicting stock prices (regression)
3. Natural language processing tasks (text classification, sentiment analysis)
4. [...] systems
5. [...] and control systems
When to Use Gradient Descent?
1. Used for training neural networks to minimize loss.
2. Choice of learning rate and batch size depends on dataset size and network complexity.
3. Stochastic or mini-batch gradient descent is preferred for large datasets due to efficiency.
Summary of Gradient Descent
• Gradient Descent is like learning by trial and error.
• Learning rate controls step size; too high → may overshoot,
too low → slow convergence.
• Works hand-in-hand with Backpropagation.
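The effect of the learning rate can be seen on a toy loss. This sketch assumes $L(w) = w^2$ (gradient $2w$, minimum at 0), a starting point of 1.0, and three arbitrary learning rates; `run` is a hypothetical helper.

```python
def run(learning_rate, steps=20, w=1.0):
    """Gradient descent on L(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w = w - learning_rate * 2.0 * w
    return w


too_low = run(0.001)   # each step multiplies w by 0.998: very slow progress
reasonable = run(0.1)  # each step multiplies w by 0.8: steady convergence
too_high = run(1.1)    # each step multiplies w by -1.2: overshoots and diverges

print(too_low, reasonable, too_high)
```

After 20 steps the low rate has barely moved from the start, the moderate rate is close to the minimum, and the high rate has moved further away than where it began.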
