
Unit 3 AI&ML

The document covers supervised learning in machine learning, focusing on linear regression models, classification techniques, and various algorithms such as logistic regression, support vector machines, and decision trees. It explains key concepts including data, model training, evaluation, and optimization methods like gradient descent. Additionally, it discusses evaluation metrics for model performance and introduces Bayesian linear regression and random forests.

Uploaded by

23l138


CS3491

ARTIFICIAL INTELLIGENCE
AND
MACHINE LEARNING
Unit 3: SUPERVISED LEARNING

Introduction to machine learning – Linear Regression Models:

Least squares, single & multiple variables, Bayesian linear
regression, gradient descent, Linear Classification Models:
Discriminant function – Probabilistic discriminative model -
Logistic regression, Probabilistic generative model – Naive
Bayes, Maximum margin classifier – Support vector machine,
Decision Tree, Random forests

Machine Learning
Machine Learning is the process of training a model to make useful
predictions or generate content from data.

ML - core concepts
● Data

● Model

● Training

● Evaluating

● Inference
● Data
Store related data in datasets
Datasets are made up of individual examples that contain
features and a label.
A dataset is characterized by its size and diversity.
● Model
A model is a complex collection of numbers that defines the
mathematical relationship from specific input feature patterns to
specific output label values. The model discovers these patterns
through training.

● Training
Before a supervised model can make predictions, it must be trained.
To train a model, we give the model a dataset with labeled
examples.
Figures (Training): an ML model making a prediction from a labeled example;
an ML model updating its predicted value; an ML model updating its
predictions for each labeled example in the training dataset.
● Evaluate
Evaluating an ML model by comparing its predictions to the actual values.

Supervised Learning
Supervised learning is a type of machine learning where a model learns from labeled
data, meaning the input data comes with corresponding correct output or "label" values.
Regression

A regression model predicts a numeric value.

Classification

A classification model predicts which category (class) an example belongs to.
Linear Regression
Linear regression is a statistical technique used to find the relationship
between variables. In an ML context, linear regression finds the relationship
between features and a label.

The model is created by drawing a best-fit line through the points.

Linear regression equation
❏ In this example, calculate the weight and bias from the line drawn.

❏ The bias is 30 (where the line intersects the y-axis), and the weight is -3.6
(the slope of the line). The model is defined as y′ = 30 + (−3.6)(x₁), where
x₁ is the car weight in thousands of pounds, and is used to make predictions.

❏ For instance, using this model, a 4,000-pound car (x₁ = 4) would have a
predicted fuel efficiency of 30 − 14.4 = 15.6 miles per gallon.
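
The prediction above can be checked with a few lines of Python. This is a minimal sketch; the function name `predict_mpg` and its parameter names are illustrative choices, not part of the original slides, and the car weight is assumed to be expressed in thousands of pounds.

```python
def predict_mpg(weight_thousand_lb, bias=30.0, weight=-3.6):
    """Predict fuel efficiency (miles per gallon) with y' = bias + weight * x1.

    weight_thousand_lb: car weight in thousands of pounds (assumed unit).
    """
    return bias + weight * weight_thousand_lb


# A 4,000-pound car corresponds to x1 = 4.0
print(round(predict_mpg(4.0), 1))  # 15.6
```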
Models with multiple features

For example, a model that predicts gas mileage could additionally use features
such as the following:

● Engine displacement
● Acceleration
● Number of cylinders
● Horsepower
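
With several features, the single-feature equation extends to one weight per feature. As a sketch, assuming the model keeps car weight as x₁ and adds the four features listed above as x₂ to x₅ (the symbols b and wᵢ are assumed notation, and no weight values are given in the slides):

```latex
y' = b + w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5
```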
Mean Squared Error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

● $n$: total number of data points
● $y_i$: the actual (true) value for the ith data point
● $\hat{y}_i$: the value predicted by the model for the ith data point
● $y_i - \hat{y}_i$: the error (residual), how far off the prediction is
● $(y_i - \hat{y}_i)^2$: the squared error; squaring ensures no error is negative
● $\sum$: sums up all the squared errors
Mean Squared Error (MSE): Example
Actual: y = [3, 5, 2]
Predicted: ŷ = [2.5, 5.3, 1.7]

1. (3 − 2.5)² = 0.25
2. (5 − 5.3)² = 0.09
3. (2 − 1.7)² = 0.09

MSE = ⅓ (0.25 + 0.09 + 0.09) = 0.43 / 3 ≈ 0.143
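
The same calculation can be reproduced with NumPy. This is a small sketch; the helper name `mean_squared_error` is chosen here for illustration.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


mse = mean_squared_error([3, 5, 2], [2.5, 5.3, 1.7])
print(round(mse, 3))  # 0.143
```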

Least Squares Method
The least squares method is a statistical technique used to find the
equation of the best-fitting curve or line for a set of data points by minimizing
the sum of the squared differences between the observed values and the
values predicted by the model.

Regression line / line of best fit

Slope

Intercept
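
For reference, the standard closed-form least-squares expressions for a straight line are given here. The symbols m (slope), c (intercept), and n (number of points) are assumed notation and may differ from the original slide.

```latex
\begin{align}
y &= m x + c \\
m &= \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2} \\
c &= \bar{y} - m \bar{x} = \frac{\sum y_i - m \sum x_i}{n}
\end{align}
```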
Example
The table below provides the monthly average petrol prices from April (Month 4) to
September (Month 9).

Month (x)   4   5   6   7   8   9
Price (y)  77  78  81  80  82  83

a. Using linear regression, calculate the best-fit line for the given data.
b. Predict the petrol price in December (Month 12).
c. Interpret the goodness of fit of the regression line using R².

R² measures how well the linear regression line fits the data by quantifying the
proportion of variance in the dependent variable (petrol price) explained by the
independent variable (month).

With SST the total sum of squares (around the mean of y) and SSR the sum of
squared residuals:

SST ≈ 26.8333
SSR ≈ 2.8190
R² = 1 - (2.8190/26.8333) ≈ 0.895.

This confirms the prior result of 0.895 (89.5%), indicating a strong fit.

Reference table for the R² calculation (predictions from the least-squares
line ŷ ≈ 72.55 + 1.1714x)

Month (x)  Actual (y)  Predicted (ŷ)  Residual (y - ŷ)  Squared Residual
4          77          77.24          -0.24             0.057
5          78          78.41          -0.41             0.168
6          81          79.58           1.42             2.014
7          80          80.75          -0.75             0.566
8          82          81.92           0.08             0.006
9          83          83.10          -0.10             0.009

Total SSR ≈ 2.82
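
Parts (a) to (c) can be worked through in NumPy as a sketch. The data are the month and actual-price columns of the reference table; the variable names are illustrative.

```python
import numpy as np

# Month numbers and actual petrol prices from the reference table
months = np.array([4, 5, 6, 7, 8, 9], dtype=float)
prices = np.array([77, 78, 81, 80, 82, 83], dtype=float)

# (a) Least-squares slope and intercept
x_mean, y_mean = months.mean(), prices.mean()
slope = np.sum((months - x_mean) * (prices - y_mean)) / np.sum((months - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# (b) Prediction for December (Month 12)
december = intercept + slope * 12

# (c) R^2 = 1 - SSR / SST
predicted = intercept + slope * months
ss_res = np.sum((prices - predicted) ** 2)  # sum of squared residuals
ss_tot = np.sum((prices - y_mean) ** 2)     # total sum of squares
r_squared = 1 - ss_res / ss_tot

print(f"price = {intercept:.2f} + {slope:.2f} * month")
print(f"December: {december:.2f}, R^2: {r_squared:.3f}")
```

Note that predicting Month 12 extrapolates three months beyond the observed range, so the result should be read with caution.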
Bayesian linear regression
Bayesian regression is a probabilistic approach to regression where we treat model
parameters (like weights) as random variables, not fixed values.

Apply Bayes’ Theorem to update our beliefs about the parameters after seeing the
data.
Model Assumption (Like Ordinary Linear Regression)

Prior over Weights w

Likelihood Function

Posterior over Weights

Making Predictions
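
The five items above can be sketched in NumPy under common textbook assumptions that are not stated in the slides: a zero-mean isotropic Gaussian prior on the weights with precision `alpha`, Gaussian noise with known precision `beta`, and a design matrix `Phi` with a bias column. All names and the toy data are hypothetical.

```python
import numpy as np


def posterior(Phi, t, alpha, beta):
    """Posterior N(m_N, S_N) over weights w.

    Assumed prior: w ~ N(0, (1/alpha) I). Assumed likelihood: t ~ N(Phi w, (1/beta) I).
    S_N^{-1} = alpha I + beta Phi^T Phi,  m_N = beta S_N Phi^T t
    """
    n_features = Phi.shape[1]
    S_N = np.linalg.inv(alpha * np.eye(n_features) + beta * Phi.T @ Phi)
    m_N = beta * S_N @ Phi.T @ t
    return m_N, S_N


def predict(phi_x, m_N, S_N, beta):
    """Predictive mean and variance for one input with feature vector phi_x."""
    mean = phi_x @ m_N
    var = 1.0 / beta + phi_x @ S_N @ phi_x  # noise + parameter uncertainty
    return mean, var


# Hypothetical toy data generated from t = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
Phi = np.column_stack([np.ones_like(x), x])  # bias column + feature
t = 1.0 + 2.0 * x

m_N, S_N = posterior(Phi, t, alpha=1e-6, beta=25.0)
mean, var = predict(np.array([1.0, 4.0]), m_N, S_N, beta=25.0)
print(m_N, mean, var)
```

With a very weak prior (small `alpha`), the posterior mean is close to the ordinary least-squares solution; the predictive variance is what the Bayesian treatment adds.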
Gradient Based Optimization
The goal of optimization in machine learning is to find the parameters of the
model that minimize a loss function (or, equivalently, maximize an objective).
The loss function quantifies how far the model's predictions are from the
actual data.
Gradient descent is a widely used optimization algorithm for minimizing the
loss function. It involves the following steps:
1. Compute Loss: At each step, calculate the loss.

2. Compute Gradient: Find the gradient (slope) of the loss function with
respect to the model parameters.

3. Update Parameters: Adjust the parameters in the opposite direction of the
gradient, scaled by a learning rate.
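
The three steps can be sketched for simple linear regression with an MSE loss. The learning rate, step count, and toy data below are assumptions chosen for illustration.

```python
import numpy as np


def gradient_descent(x, y, learning_rate=0.05, n_steps=2000):
    """Fit y ≈ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(x)
    loss = None
    for _ in range(n_steps):
        error = (w * x + b) - y
        loss = np.mean(error ** 2)              # 1. compute loss
        grad_w = (2.0 / n) * np.sum(error * x)  # 2. compute gradient
        grad_b = (2.0 / n) * np.sum(error)
        w -= learning_rate * grad_w             # 3. step against the gradient
        b -= learning_rate * grad_b
    return w, b, loss


# Hypothetical data lying exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
w, b, loss = gradient_descent(x, y)
print(round(w, 3), round(b, 3))
```

A learning rate that is too large makes the updates diverge, and one that is too small makes convergence slow, so the value usually needs tuning per problem.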
Linear Classification Model
Classification
Classification is a fundamental machine learning technique aimed at organizing input data
into distinct classes.
Types of Classification

❖ Binary Classification: two distinct categories

❖ Multiclass Classification: more than two categories
Characteristics of Classification Models

❖ Class Separation

❖ Decision Boundaries

❖ Sensitivity to Data Quality

❖ Handling Imbalanced Data

Classification Algorithms
● Linear Classifiers
● Non-Linear Classifiers
Figure: data points of two classes, with a straight line separating the classes.
How do mean and variance play a role?
Mean: the average value of each feature.

Variance: measures how much each feature spreads.

Correlation: measures how features move together.

Predict the Class with the argmax function
The argmax function is used to select the class with the highest score or
probability.

Example
import numpy as np

# Example output probabilities from a model (one entry per class)
probabilities = np.array([0.1, 0.8, 0.1])

# Find the index of the maximum probability
predicted_class = np.argmax(probabilities)

print(predicted_class)  # Output: 1
Discriminant Function

A discriminant function is a function used in pattern classifiers
to partition the feature space based on probabilities or
equivalent functions, helping to determine the class to which a
given input belongs.
Linear Discriminant Analysis (LDA)

LDA models the distinctions between groups, effectively separating two
or more classes.

LDA operates by projecting features from a higher-dimensional
space into a lower-dimensional one.

In machine learning, LDA serves as a supervised learning algorithm
specifically designed for classification tasks, aiming to identify a
linear combination of features that optimally segregates classes
within a dataset.

LDA assumes that the data has a Gaussian distribution and that the
covariance matrices of the different classes are equal.

It also assumes that the data is linearly separable, meaning that a
linear decision boundary can accurately classify the different
classes.

Two criteria are used by LDA to create a new axis:

1. Maximize the distance between the means of the two
classes.
2. Minimize the variation within each class.
Mathematical Intuition Behind LDA

1: Maximize the distance between class means

2: Minimize the variation within each class

Step-by-step

1. Project data onto a line

2. Class means after projection

3. Variance (scatter) within each class after projection

4. Fisher’s Criterion (Objective Function)

5. Optimal w
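
The five steps can be sketched for two classes in NumPy, using the standard Fisher formulation: the criterion J(w) is the squared distance between projected class means divided by the total within-class scatter after projection, and the maximizing direction is proportional to S_W⁻¹(m₁ − m₂). The function names and toy data are hypothetical.

```python
import numpy as np


def fisher_direction(X1, X2):
    """Unit vector w proportional to S_W^{-1} (m1 - m2) for two classes."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Within-class scatter matrix: sum of the per-class scatter matrices
    S_W = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(S_W, m1 - m2)
    return w / np.linalg.norm(w)


def fisher_criterion(w, X1, X2):
    """J(w) = (projected mean gap)^2 / (projected within-class scatter)."""
    p1, p2 = X1 @ w, X2 @ w  # step 1: project data onto the line
    between = (p1.mean() - p2.mean()) ** 2  # step 2: projected class means
    within = np.sum((p1 - p1.mean()) ** 2) + np.sum((p2 - p2.mean()) ** 2)  # step 3
    return between / within  # step 4


# Hypothetical two-class data
X1 = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 3.0], [2.0, 1.0]])
X2 = np.array([[6.0, 5.0], [7.0, 8.0], [8.0, 7.0], [7.0, 6.0]])

w = fisher_direction(X1, X2)  # step 5: optimal w
print(w, fisher_criterion(w, X1, X2))
```

On this toy data the two classes do not overlap after projection onto w, which is the behavior the two criteria are designed to produce.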
Probabilistic generative model
vs
Probabilistic discriminative model

A probabilistic generative model learns the class-conditional densities
p(x | C_k) and the class priors p(C_k), then applies Bayes' theorem to obtain
p(C_k | x) (for example, Naive Bayes). A probabilistic discriminative model
learns p(C_k | x) directly (for example, logistic regression).
Problem in Linear Classification
When the classes cannot be completely separated by a linear equation, a hard
linear decision rule alone is inadequate.

Logistic regression is used to predict the probability of the output being a
specific category based on the input features.
Logistic Regression

❏ Logistic regression is a supervised machine learning algorithm used for classification
tasks where the goal is to predict the probability that an instance belongs to a given
class.

❏ Logistic regression is used for binary classification, where a sigmoid function takes
a linear combination of the independent variables as input and produces a probability
value between 0 and 1.

❏ For example, with two classes, Class 0 and Class 1: if the value of the logistic
function for an input is greater than 0.5 (the threshold value), the input belongs to
Class 1; otherwise it belongs to Class 0.
❏ In logistic regression, instead of fitting a regression line, we fit an “S”-shaped
logistic function whose output is bounded between 0 and 1.
Equation of Logistic Regression:
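
For reference, the standard form of the logistic regression model is given here. The symbols w (weight vector), b (bias), and z (linear score) are assumed notation and may differ from the original slide.

```latex
\begin{align}
z &= \mathbf{w}^{\top}\mathbf{x} + b \\
P(y = 1 \mid \mathbf{x}) &= \sigma(z) = \frac{1}{1 + e^{-z}}
\end{align}
```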
❏ Logistic regression is a supervised learning algorithm used to estimate the
probability that a given instance belongs to a particular class.
❏ Logistic regression outputs probabilities between 0 and 1. It achieves this by passing
the linear combination of input features through a sigmoid function, ensuring that
predictions remain within a valid probability range.

Types of Logistic Regression

❏ Binomial Logistic Regression: Applied when the dependent variable has
two outcomes (e.g., spam vs. non-spam).
❏ Multinomial Logistic Regression: Used when the target variable has more
than two categories without a natural order (e.g., product categories).
❏ Ordinal Logistic Regression: Applied when the dependent variable has
ordered categories (e.g., satisfaction levels: low, medium, high).
Steps in Logistic Regression

1. Sigmoid Function and Probability Estimation: the sigmoid function transforms the
linear regression output into a probability value between 0 and 1.

2. Logistic Regression Equation

3. Cost Function

4. Gradient descent to optimize parameters

Linear vs Logistic Regression
Example
Classify whether a student gets admitted (1) or not admitted (0) to a college based on
their exam score.

Step 1: Model Assumption

Step 2: Initialize Parameters

Step 3: Define Cost Function (Log Loss)

Step 4: Gradient Descent Optimization

Step 5: After Training

Step 6: Make Predictions
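
The six steps can be sketched end to end in NumPy. The exam scores and admission labels below are hypothetical (the original data did not survive), and the learning rate, iteration count, and score standardization are assumptions made so that plain gradient descent behaves well.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(y, p, eps=1e-12):
    """Step 3: binary cross-entropy (log loss)."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


# Hypothetical exam scores and admission outcomes (1 = admitted)
scores = np.array([35.0, 45.0, 50.0, 55.0, 60.0, 65.0, 75.0, 85.0])
admitted = np.array([0, 0, 0, 1, 0, 1, 1, 1], dtype=float)

# Standardize the feature (an assumption, to keep the step size simple)
mu, sigma = scores.mean(), scores.std()
x = (scores - mu) / sigma

# Step 1: model p = sigmoid(w * x + b). Step 2: initialize parameters.
w, b = 0.0, 0.0
initial_loss = log_loss(admitted, sigmoid(w * x + b))

# Step 4: gradient descent on the log loss
learning_rate = 0.1
for _ in range(5000):
    p = sigmoid(w * x + b)
    w -= learning_rate * np.mean((p - admitted) * x)
    b -= learning_rate * np.mean(p - admitted)

# Step 5: after training
final_loss = log_loss(admitted, sigmoid(w * x + b))


def predict(score, threshold=0.5):
    """Step 6: classify a raw exam score as admitted (1) or not (0)."""
    return int(sigmoid(w * (score - mu) / sigma + b) > threshold)


print(predict(35.0), predict(85.0), round(final_loss, 3))
```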

Support Vector Machine (SVM) Algorithm

Support Vector Machine (SVM) is a supervised machine learning algorithm used for
classification and regression tasks.

SVM aims to find the optimal hyperplane in an N-dimensional space to separate
data points into different classes.

The algorithm maximizes the margin between the closest points of different classes.

Hyperplane: the decision boundary that separates the classes.

Support Vectors: the data points closest to the hyperplane.

Margin: the distance between the hyperplane and the support vectors.
Linear SVM

Non-Linear SVM
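
The three terms can be illustrated numerically. The sketch below does not train an SVM; it takes a hand-picked hyperplane w·x + b = 0 and hypothetical toy points, then computes each point's distance to the hyperplane, the margin, and the support vectors.

```python
import numpy as np

# Hypothetical hyperplane: x1 + x2 - 5 = 0
w = np.array([1.0, 1.0])
b = -5.0

# Hypothetical, linearly separable toy data with labels -1 / +1
X = np.array([[1.0, 2.0], [2.0, 2.0], [2.0, 1.0],
              [3.0, 4.0], [4.0, 3.0], [3.0, 3.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

scores = X @ w + b
predictions = np.sign(scores)  # which side of the hyperplane

# Signed distance of each point to the hyperplane (positive if correctly classified)
distances = y * scores / np.linalg.norm(w)

margin = distances.min()  # distance to the closest points
support_vectors = X[np.isclose(distances, margin)]

print(margin)           # about 0.707
print(support_vectors)  # the points [2, 2] and [3, 3]
```

A trained linear SVM would search over w and b for the hyperplane that makes this minimum distance as large as possible.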
Decision Tree Classification Algorithm

❏ Internal nodes represent the features of a dataset.

❏ Branches represent the decision rules.
❏ Each leaf node represents the outcome.

Figure labels: Decision Node, Leaf Node

○ Step-1: Begin the tree with the root node, call it S, which contains the
complete dataset.

○ Step-2: Find the best attribute in the dataset using an Attribute Selection
Measure (ASM).

○ Step-3: Divide S into subsets that contain the possible values of the best
attribute.

○ Step-4: Generate the decision tree node that contains the best attribute.

○ Step-5: Recursively build new decision trees using the subsets of the
dataset created in Step-3. Continue this process until a stage is reached
where the nodes cannot be classified further; each such final node is called a
leaf node.
Random Forest Algorithm
Evaluation Metrics
○ Accuracy
○ Logarithmic Loss
○ Area Under Curve (AUC)
○ Precision
○ Recall
○ F1 Score
○ Confusion Matrix
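Accuracy
Accuracy is the proportion of correct predictions made by the model, out of all
predictions:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP, and FN are the counts of true positives, true negatives,
false positives, and false negatives.

True Positive Rate (Recall): $\frac{TP}{TP + FN}$

True Negative Rate: $\frac{TN}{TN + FP}$

False Positive Rate: $\frac{FP}{FP + TN}$

Precision: $\frac{TP}{TP + FP}$
Confusion Matrix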
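
The metrics listed in this section can all be derived from the four confusion-matrix counts. This is a minimal sketch; the function name and the example counts are hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute common metrics from binary confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called the true positive rate
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "true_negative_rate": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts: 40 TP, 45 TN, 5 FP, 10 FN
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The sketch assumes every denominator is non-zero; a production version would need to handle classes with no predicted or no actual positives.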
Decision Tree problems
❏ Selecting the Best Attribute

❏ Creating Tree Nodes

❏ Stopping Criteria

❏ Handling Missing Values

❏ Tree Pruning

Entropy and Information Gain
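
Entropy and information gain are one common way to select the best attribute when building a decision tree. The sketch below uses the standard definitions (entropy in bits; gain = parent entropy minus the weighted entropy of the subsets); the labels and attribute values are hypothetical.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(labels, feature_values):
    """Reduction in entropy obtained by splitting on a categorical feature."""
    labels = np.asarray(labels)
    feature_values = np.asarray(feature_values)
    weighted = 0.0
    for value in np.unique(feature_values):
        subset = labels[feature_values == value]
        weighted += len(subset) / len(labels) * entropy(subset)
    return entropy(labels) - weighted


# Hypothetical data: attribute_a separates the classes perfectly, attribute_b barely helps
labels = ["yes", "yes", "yes", "no", "no", "no"]
attribute_a = ["a", "a", "a", "b", "b", "b"]
attribute_b = ["x", "y", "x", "y", "x", "y"]

gain_a = information_gain(labels, attribute_a)
gain_b = information_gain(labels, attribute_b)
print(round(gain_a, 3), round(gain_b, 3))  # 1.0 0.082
```

Here attribute_a would be chosen for the split because it yields the larger information gain.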

Thank You
