Understanding Supervised Machine Learning

Uploaded by

en24ca5030040
Supervised Machine Learning

• Supervised machine learning is a type of machine learning in which
machines are trained on well-labelled training data and, on the basis of
that data, predict the output. Labelled data means that each input is
already tagged with the correct output.
• In supervised learning, the training data provided to the machine works as
a supervisor that teaches the machine to predict the output correctly. It
applies the same concept as a student learning under the supervision of a
teacher.
• Supervised learning is the process of providing input data as well as the
correct output data to the machine learning model. The aim of a supervised
learning algorithm is to find a mapping function that maps the input
variable (x) to the output variable (y).
• In the real world, supervised learning can be used for risk assessment,
image classification, fraud detection, spam filtering, etc.
How Does Supervised Learning Work?
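The idea of learning a mapping from labelled inputs (x) to outputs (y) can be sketched in a few lines. The example below is only an illustration: the email features, the labels and the choice of a 1-nearest-neighbour rule are all assumptions made for this sketch, not something taken from the slides.

```python
# Hypothetical labelled training data. Each input x is
# (number of links, number of exclamation marks) in an email,
# and each output y is the correct label supplied in advance.
training_data = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
    ((5, 4), "spam"),
    ((7, 6), "spam"),
    ((6, 3), "spam"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def predict(x, data=training_data):
    """Return the label of the closest training example (1-nearest-neighbour)."""
    _, nearest_label = min(data, key=lambda pair: squared_distance(pair[0], x))
    return nearest_label


# Predict labels for inputs that were not in the training data.
print(predict((6, 5)))
print(predict((0, 2)))
```

The labelled pairs play the role of the supervisor: the model never sees a rule for what spam is, only examples of inputs with their correct outputs.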

Types of Supervised Machine Learning Algorithms:
Regression

• Regression algorithms are used when there is a relationship between
the input variable and the output variable and the output is a
continuous value. They are used to predict continuous variables, as in
weather forecasting, market trends, etc.
• Some popular Regression algorithms which come under
supervised learning:
Linear Regression
Regression Trees
Non-Linear Regression
Bayesian Linear Regression
Polynomial Regression
Classification
• Classification algorithms are used when the output
variable is categorical, which means it takes one of a fixed
set of classes, such as Yes-No, Male-Female, True-False, etc.
• Classification algorithms (a typical application is spam filtering):
Random Forest
Decision Trees
Logistic Regression
Support Vector Machines
Advantages of Supervised learning:
• With the help of supervised learning, the model can
predict the output on the basis of prior experiences.
• In supervised learning, we can have an exact idea about
the classes of objects.
• Supervised learning models help us to solve various
real-world problems such as fraud detection, spam
filtering, etc.
Disadvantages of Supervised learning:
• Supervised learning models can struggle with very
complex tasks.
• Supervised learning may not predict the correct output if
the test data differs substantially from the training dataset.
• Training can require a lot of computation time.
• In supervised learning, we need enough knowledge
about the classes of objects.
Linear regression

• A machine learning model tries to capture the relationship
between the input features and the output features.
• Imagine we have data on the heights and weights of
thousands of people. We want to use this data to create a
Machine Learning model that takes the height of a person as
input and predicts the weight of that person.
• Plotting weight (in pounds) against height (in inches) shows
the relationship between the two.
• The relationship looks linear, i.e. with an increase in
height, the weight also increases.
• This kind of relationship between the input feature (height) and the output
feature (weight) can be captured by a linear regression model, which tries to
fit a straight line to this data.
The equation of the line in a simple linear regression model is:
y = mx + c
where ‘m’ is the slope and ‘c’ is the constant (intercept). For different
values of ‘m’ and ‘c’, we get different lines.
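To make the effect of the slope and the constant concrete, here is a small sketch that evaluates y = mx + c for a few (m, c) pairs. The heights and the (m, c) values are hypothetical and chosen only for illustration.

```python
def predict_weight(height, m, c):
    """Evaluate the line y = m * x + c for a given height."""
    return m * height + c


# Hypothetical heights in inches and three hypothetical candidate lines.
heights = [60, 65, 70]
candidate_lines = [(2.0, 0.0), (3.0, -50.0), (5.0, -200.0)]

for m, c in candidate_lines:
    predictions = [predict_weight(h, m, c) for h in heights]
    print(f"m={m}, c={c}: {predictions}")
```

Each (m, c) pair gives a different set of predicted weights for the same heights, which is why a way of scoring the lines is needed.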
How does linear regression find the best-fit line?

Cost Function
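A cost function is a mathematical function that is
minimized to obtain the optimal values of the slope ‘m’ and
the constant ‘c’. The cost function associated with linear
regression is called the mean squared error (MSE) and can be
written as:
MSE = (1/n) * Σ (y_i - ŷ_i)^2
where n is the number of data points, y_i is the actual value and
ŷ_i = m * x_i + c is the predicted value for the i-th data point.
Error: the difference between the actual and the predicted value.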
• Suppose the actual and predicted weights are as follows:
[Table: Actual vs Predicted Weights]
• We can calculate the MSE as follows:
[Figure: The updated sum of squared errors]
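As a rough illustration of the MSE calculation, the sketch below averages the squared differences between actual and predicted weights. The three weight values are hypothetical and are not the numbers from the original table.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical actual and predicted weights in pounds.
actual_weights = [150.0, 160.0, 170.0]
predicted_weights = [148.0, 163.0, 170.0]

# Errors are 2, -3 and 0, so the squared errors are 4, 9 and 0.
mse = mean_squared_error(actual_weights, predicted_weights)
print(mse)
```

Squaring keeps positive and negative errors from cancelling out and penalises large errors more heavily than small ones.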


• Now, we need to find the optimal values of ‘m’
and ‘c’ so that the MSE is minimized. Intuitively,
we want the predicted weights to be as close as
possible to the actual weights.
• But how do we do that? There is a huge number of possible
combinations of ‘m’ and ‘c’, so we cannot test them all.

• So, Gradient Descent is the solution!


Gradient Descent

• Gradient Descent is a technique to minimize the
value of a function, which is the mean squared error
in the case of linear regression.
To do this, it requires two things: a direction and a learning rate.
The direction comes from the partial derivatives of the cost function
at the current parameter values, and the learning rate sets the step
size. Together they determine the update made in each iteration,
allowing the algorithm to gradually arrive at a local or global
minimum (i.e. the point of convergence).
How Gradient descent works

The cost function of linear regression (MSE) is a convex function, i.e.
it has a single minimum over the range of values of the slope ‘m’ and
the constant ‘c’, as shown in the figure (the cost function is denoted
by J(m, c)).
Mathematically, Gradient Descent works by calculating
the partial derivative (slope) of the cost function with respect to
‘m’ and ‘c’ at their current values. At each step, the values of both
‘m’ and ‘c’ are updated simultaneously:
m = m - alpha * (∂J/∂m)
c = c - alpha * (∂J/∂c)
where alpha is the learning rate. The values keep updating
until we reach the values of ‘m’ and ‘c’ for which the cost function
reaches its minimum.
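The update procedure described above can be sketched as follows for the MSE cost of a simple linear regression. The toy data (points on the line y = 2x + 1), the learning rate and the number of iterations are assumptions chosen so the sketch converges quickly. Real height and weight data would usually need feature scaling or a much smaller learning rate.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit y = m * x + c by gradient descent on the mean squared error."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(iterations):
        # Residuals (actual - predicted) at the current m and c.
        residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
        # Partial derivatives of J(m, c) = (1/n) * sum(residual^2).
        grad_m = (-2.0 / n) * sum(x * r for x, r in zip(xs, residuals))
        grad_c = (-2.0 / n) * sum(residuals)
        # Simultaneous update: both gradients were computed before either change.
        m -= alpha * grad_m
        c -= alpha * grad_c
    return m, c


# Hypothetical toy data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

m, c = gradient_descent(xs, ys)
print(round(m, 4), round(c, 4))
```

Because the MSE of linear regression is convex, the starting point (here m = 0, c = 0) does not change which minimum is reached, only how many steps it takes.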

Learning rate (or alpha) is the rate at which the values of ‘m’ and ‘c’ are
updated. The larger the value of alpha, the bigger the update to ‘m’ and ‘c’.
• The learning rate (also referred to as the step size) is the
size of the steps taken to reach the minimum. It is
typically a small value, and it can be tuned based on
the behavior of the cost function.
• A high learning rate results in larger steps but risks overshooting
the minimum.
• Conversely, a low learning rate results in small steps. While this has
the advantage of more precision, the larger number of iterations
compromises overall efficiency, as it takes more time and
computation to reach the minimum.
• If the slope is negative, the value of ‘m’ increases
by learning rate * |slope with respect to ‘m’|, and if the slope is
positive, the value of ‘m’ decreases by learning
rate * slope with respect to ‘m’. The same is true for the value of ‘c’.
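The effect of the learning rate can be seen in a small experiment. The sketch below fits only a slope ‘m’ (with ‘c’ fixed at 0) to hypothetical data on the line y = 2x and runs 20 gradient descent steps with three different values of alpha. The data and the alpha values are assumptions picked to show the three behaviours.

```python
def fit_slope(alpha, steps=20):
    """Run gradient descent on the MSE of y = m * x and return the final m."""
    # Hypothetical data lying exactly on y = 2x, so the best slope is 2.
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]
    n = len(xs)
    m = 0.0
    for _ in range(steps):
        # Slope of the cost function with respect to m at the current m.
        grad_m = (-2.0 / n) * sum(x * (y - m * x) for x, y in zip(xs, ys))
        # Negative slope increases m, positive slope decreases m.
        m -= alpha * grad_m
    return m


slow = fit_slope(alpha=0.001)      # small steps: still far from 2 after 20 steps
good = fit_slope(alpha=0.1)        # moderate steps: very close to 2
diverged = fit_slope(alpha=0.25)   # steps too large: overshoots further each time

print(slow, good, diverged)
```

With the smallest alpha the slope is still well short of 2 after 20 steps, while with the largest alpha every step overshoots the minimum by more than the previous one.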
Classification

Classification algorithms are used when the output variable is categorical, which
means it takes one of a fixed set of classes, such as Yes-No, Male-Female,
True-False, etc.
Classification Algorithms:
• Linear Models
  • Logistic Regression
  • Support Vector Machines
• Non-linear Models
  • K-Nearest Neighbours
  • Kernel SVM
  • Naïve Bayes
  • Decision Tree Classification
  • Random Forest Classification
Use cases of Classification Algorithms

• Email Spam Detection
• Speech Recognition
• Identification of cancerous tumor cells
• Drug Classification
• Biometric Identification, etc.
Classification Example
Logistic Regression

Logistic Regression is a popular classification algorithm
used in machine learning to predict the probability that
an instance belongs to a particular class. Despite its
name, it is a classification algorithm, not a regression
algorithm.

This is an explainable algorithm. It classifies a data point
by modeling its probability of belonging to a given class
using the sigmoid function.
Sigmoid Function

At the core of Logistic Regression lies the Sigmoid
Function, also known as the Logistic Function. The
Sigmoid Function transforms the output of the linear
equation into a value between 0 and 1. The formula for
the Sigmoid Function is:
sigmoid(z) = 1 / (1 + e^(-z))
where z is the output of the linear equation (for example, z = mx + c).
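A minimal sketch of the sigmoid function follows. The branch on the sign of z is an implementation choice made here to avoid overflow for large negative inputs; it computes the same mathematical function in both branches.

```python
import math


def sigmoid(z):
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Equivalent form that avoids computing exp() of a large positive number.
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

An output of 0.5 corresponds to z = 0, so a common convention is to predict the positive class when the sigmoid output is above 0.5 and the negative class otherwise.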
Age Have_insurance
22 0
25 0
47 1
52 0
46 1
56 1
55 0
60 1
62 1
61 1
18 0
28 0
27 0
29 0
49 1
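
Assuming the table above is meant as training data for predicting Have_insurance from Age, a logistic regression model could be fitted roughly as follows. This is a sketch, not the method used in the slides: the log loss (cross-entropy) cost, the age standardisation, the learning rate and the iteration count are all assumptions.

```python
import math

# Data copied from the table: age and whether the person has insurance.
ages = [22, 25, 47, 52, 46, 56, 55, 60, 62, 61, 18, 28, 27, 29, 49]
has_insurance = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1]

# Standardise age so that gradient descent behaves well.
mean_age = sum(ages) / len(ages)
std_age = math.sqrt(sum((a - mean_age) ** 2 for a in ages) / len(ages))


def scale(age):
    return (age - mean_age) / std_age


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


xs = [scale(a) for a in ages]
n = len(xs)
m, c = 0.0, 0.0   # slope and constant of the linear equation z = m * x + c
alpha = 0.1       # assumed learning rate

for _ in range(5000):
    probabilities = [sigmoid(m * x + c) for x in xs]
    # Gradients of the average log loss with respect to m and c.
    grad_m = sum((p - y) * x for p, y, x in zip(probabilities, has_insurance, xs)) / n
    grad_c = sum(p - y for p, y in zip(probabilities, has_insurance)) / n
    m -= alpha * grad_m
    c -= alpha * grad_c


def predict_probability(age):
    """Estimated probability that a person of this age has insurance."""
    return sigmoid(m * scale(age) + c)


print(predict_probability(20), predict_probability(60))
```

The fitted slope is positive, so the estimated probability of having insurance rises with age, which matches the pattern in the table.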
