0% found this document useful (0 votes)
3 views45 pages

Supervised Learning: Classification Explained

The document provides an overview of supervised machine learning, detailing its types, including classification and regression, and explaining how models are trained using labeled data. It discusses various classification types such as binary, multiclass, and multi-label classification, along with real-life applications and examples. Additionally, it covers key concepts like decision boundaries, model evaluation, and specific algorithms used in classification tasks.

Uploaded by

hussaintallat547
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
3 views45 pages

Supervised Learning: Classification Explained

The document provides an overview of supervised machine learning, detailing its types, including classification and regression, and explaining how models are trained using labeled data. It discusses various classification types such as binary, multiclass, and multi-label classification, along with real-life applications and examples. Additionally, it covers key concepts like decision boundaries, model evaluation, and specific algorithms used in classification tasks.

Uploaded by

hussaintallat547
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd

Classification

Types of Machine Learning


• 1. Supervised learning

• Supervised learning is a type of machine learning where a


model is trained on labeled data—
• meaning each input is paired with the correct output.
• The model learns by comparing its predictions with the actual
answers provided in the training data.
• The process is like a teacher guiding a student—hence the term
“supervised” learning
• Both classification and regression problems are supervised
learning problems.
What is Supervised Machine Learning?
• supervised learning is a type of machine learning where a
model is trained on labeled data—meaning each input is paired
with the correct output. the model learns by comparing its
predictions with the actual answers provided in the training data.
• Over time, it adjusts itself to minimize errors and improve
accuracy.
• The goal of supervised learning is to make accurate predictions
when given new, unseen data.
• For example, if a model is trained to recognize handwritten digits,
it will use what it learned to correctly identify new numbers it
hasn’t seen before.
• Supervised learning can be applied in various forms,
including supervised learning classification and supervised
learning regression, making it a crucial technique in the field of
• A fundamental concept in supervised machine learning is
learning a class from examples.
• This involves providing the model with examples where the
correct label is known, such as learning to classify images of cats
and dogs by being shown labeled examples of both.
• The model then learns the distinguishing features of each class
and applies this knowledge to classify new images.
• Following are some real-life use cases of supervised
learning −

• Image Classification
• Spam Filtering
• House Price Prediction
• Signature Recognition
• Weather Forecasting
• Stock price prediction
• Types of Supervised Learning in Machine Learning
• Now, Supervised learning can be applied to two main types of
problems:
• Classification: Where the output is a categorical variable (e.g.,
spam vs. non-spam emails, yes vs. no).

• Regression: Where the output is a continuous variable (e.g.,


predicting house prices, stock prices).
• While training the model, data is usually split in the ratio of
80:20 i.e. 80% as training data and the rest as testing data.
• In training data, we feed input as well as output for 80% of data.
• The model learns from training data only.
• We use different supervised learning algorithms (discuss
later ) to build our model.
• Let’s first understand the classification and regression data
through the table below
• Both the above figures have labelled data set as follows:
• Figure A: It is a dataset of a shopping store that is useful in
predicting whether a customer will purchase a particular product
under consideration or not based on his/ her gender, age, and
salary.
Input: Gender, Age, Salary
Output: Purchased i.e. 0 or 1; 1 means yes the customer will
purchase and 0 means that the customer won’t purchase it.

• Figure B: It is a Meteorological dataset that serves the purpose


of predicting wind speed based on different parameters.
Input: Dew Point, Temperature, Pressure, Relative Humidity,
Wind Direction
Output: Wind Speed
• Classification
• Classification deals with predicting categorical target variables, which
represent discrete classes or labels.
• For instance, classifying emails as spam or not spam, or predicting whether a
patient has a high risk of heart disease.
• Classification algorithms learn to map the input features to one of the predefined
classes.
• Here are some classification algorithms:
• Logistic Regression

• Support Vector Machine

• Random Forest

• Decision Tree

• K-Nearest Neighbors (KNN)


• Regression
• Regression, on the other hand, deals with predicting continuous target
variables, which represent numerical values.
• For example, predicting the price of a house based on its size, location, and
amenities, or forecasting the sales of a product.
• Regression algorithms learn to map the input features to a continuous
numerical value.
• Here are some regression algorithms:
• Linear Regression

• Polynomial Regression

• Ridge Regression

• Lasso Regression

• Decision tree
Classification
• Classification teaches a machine to sort things into categories.
• It learns by looking at examples with labels (like emails
marked “spam” or “not spam”).
• After learning, it can decide which category new items belong
to, like identifying if a new email is spam or not.
• For example, a classification model might be trained on
dataset of images labeled as either dogs or cats and it can be
used to predict the class of new and unseen images as dogs or
cats based on their features such as color, texture and shape
Explaining classification in ml, horizontal axis represents
the combined values of color and texture features.
Vertical axis represents the combined values of shape and
size features.
• Each colored dot in the plot represents an individual
image, with the color indicating whether the model
predicts the image to be a dog or a cat.
• The shaded areas in the plot show the decision
boundary, which is the line or region that the model uses
to decide which category (dog or cat) an image belongs to.
• The model classifies images on one side of the boundary
as dogs and on the other side as cats, based on their
features.
Types of Classification
• When we talk about classification in machine learning, we’re
talking about the process of sorting data into categories based
on specific features or characteristics.
• There are different types of classification problems depending
on how many categories (or classes) we are working with and
how they are organized.
• There are two main classification types in machine learning
• 1. Binary Classification
• 2. Multiclass Classification
Binary Classification
• This is the simplest kind of classification. In binary classification,
the goal is to sort the data into two distinct categories.
• Think of it like a simple choice between two options.
• Imagine a system that sorts emails into either spam or not
spam.
• It works by looking at different features of the email like
certain keywords or sender details, and decides whether it’s
spam or not.
• Determining if a customer will click on an ad or not.
• Identifying if a credit card transaction is fraudulent or legitimate
• It only chooses between these two options.
2. Multiclass Classification

Here, instead of just two categories, the data needs to be sorted


into more than two categories.
The model picks the one that best matches the input.
Think of an image recognition system that sorts pictures of
animals into categories like cat, dog, and bird.
• Basically, machine looks at the features in the image (like
shape, color, or texture) and chooses which animal the
picture is most likely to be based on the training it
received
3. Multi-Label Classification
• In multi-label classification single piece of data can belong to
multiple categories at once.
• Unlike multiclass classification where each data point belongs to
only one class, multi-label classification allows datapoints to
belong to multiple classes.
• A movie recommendation system could tag a movie as both
action and comedy. The system checks various features (like
movie plot, actors, or genre tags) and assigns multiple labels to
a single piece of data, rather than just one.
How does Classification in Machine Learning Work?
Classification involves training a model using a labeled dataset,
where each input is paired with its correct output label.
The model learns patterns and relationships in the data, so it can
later predict labels for new, unseen inputs.
• In machine learning, classification works by training a model
to learn patterns from labeled data, so it can predict the
category or class of new, unseen data. Here’s how it works:
[Link] Collection: You start with a dataset where each item is labeled with
the correct class (for example, “cat” or “dog”).

[Link] Extraction: The system identifies features (like color, shape, or


texture) that help distinguish one class from another. These features are
what the model uses to make predictions.

[Link] Training: Classification – machine learning algorithm uses the


labeled data to learn how to map the features to the correct class. It looks
for patterns and relationships in the data.

[Link] Evaluation: Once the model is trained, it’s tested on new, unseen
data to check how accurately it can classify the items.

[Link]: After being trained and evaluated, the model can be used to
predict the class of new data based on the features it has learned.

[Link] Evaluation: Evaluating a classification model is a key step in


machine learning. It helps us check how well the model performs and how
good it is at handling new, unseen data. Depending on the problem and
• If the quality metric is not satisfactory, the ML
algorithm or hyperparameters can be adjusted, and
the model is retrained.
• This iterative process continues until a satisfactory
performance is achieved.
• In short, classification in machine learning is all
about using existing labeled data to teach the model
how to predict the class of new, unlabeled data based
on the patterns it has learned.
Examples of Machine Learning Classification in
Real Life

Classification algorithms are widely used in many real-world


applications across various domains, including:
• Email spam filtering

• Credit risk assessment: Algorithms predict whether a loan applicant


is likely to default by analyzing factors such as credit score, income,
and loan history. This helps banks make informed lending decisions
and minimize financial risk.

• Medical diagnosis : Machine learning models classify whether a


patient has a certain condition (e.g., cancer or diabetes) based on
medical data such as test results, symptoms, and patient history. This
aids doctors in making quicker, more accurate diagnoses, improving
patient care.
• Image classification : Applied in fields such as facial
recognition, autonomous driving, and medical imaging.

• Sentiment analysis: Determining whether the sentiment of a


piece of text is positive, negative, or neutral. Businesses use this
to understand customer opinions, helping to improve products
and services.

• Fraud detection : Algorithms detect fraudulent activities by


analyzing transaction patterns and identifying anomalies crucial
in protecting against credit card fraud and other financial crimes.

• Recommendation systems : Used to recommend products or


content based on past user behavior, such as suggesting movies
on Netflix or products on Amazon. This personalization boosts
user satisfaction and sales for businesses.
Classification Modeling in Machine
Learning
• Classification modeling refers to the process of using
machine learning algorithms to categorize data into
predefined classes or labels.
• These models are designed to handle both binary and
multi-class classification tasks, depending on the nature
of the problem.
• Let’s see key characteristics of Classification Models:
[Link] Separation: Classification relies on distinguishing between distinct
classes. The goal is to learn a model that can separate or categorize data
points into predefined classes based on their features.

[Link] Boundaries: The model draws decision boundaries in the feature


space to differentiate between classes. These boundaries can be linear or
non-linear.

[Link] to Data Quality: Classification models are sensitive to the


quality and quantity of the training data. Well-labeled, representative data
ensures better performance, while noisy or biased data can lead to poor
predictions.

[Link] Imbalanced Data: Classification problems may face challenges


when one class is underrepresented. Special techniques like resampling or
weighting are used to handle class imbalances.

[Link]: Some classification algorithms, such as Decision Trees,


offer higher interpretability, meaning it’s easier to understand why a model
made a particular prediction.
Classification Algorithms
Now, for implementation of any classification model it is essential to
understand Logistic Regression, which is one of the most fundamental
and widely used algorithms in machine learning for classification tasks.
There are various types of classifiers algorithms. Some of them are :
• Linear Classifiers: Linear classifier models create a linear decision
boundary between classes. They are simple and computationally efficient.

• Some of the linear classification models are as follows:

• Logistic Regression

• Support Vector Machines having kernel = ‘linear’

• Single-layer Perceptron

• Stochastic Gradient Descent (SGD) Classifier


Non-linear Classifiers: Non-linear models create a non-linear
decision boundary between classes.
They can capture more complex relationships between input
features and target variable. Some of the non-
linear classification models are as follows:
• K-Nearest Neighbours

• Kernel SVM

• Naive Bayes

• Decision Tree Classification

• Random Forests,

• AdaBoost,
Linear Regression in Machine learning
• Linear regression is a statistical method used to model the
relationship between a dependent variable and one or more
independent variables.
• Linear regression is a type of algorithm that tries to find the linear
relation between input features and output values for the prediction
of future events. This algorithm is widely used to perform stock
analysis, weather forecasting and others.
• It provides valuable insights for prediction and data analysis.
• Linear regression is also a type of supervised machine-learning
algorithm that learns from the labelled datasets and maps the data
points with most optimized linear functions which can be used for
prediction on new datasets.
• It computes the linear relationship between the dependent variable
and one or more independent features by fitting a linear equation
with observed data. It predicts the continuous output variables based
• For linear regression in machine learning, we represent features
as independent variables and target values as the dependent
variable
• For example if we want to predict house price we consider
various factor such as house age, distance from the main road,
location, area and number of room, linear regression uses all
these parameter to predict house price as it consider a linear
relation between all these features and price of house.
For the simplicity, take the following data (Single feature and single target )
Square Feet House Price
(X) (Y)
1300 240
1500 320
1700 330
1830 295
1550 256
2350 409
• In the above data,
• the target House Price is the dependent variable represented by
X,
• and the feature, Square Feet, is the independent variable
represented by Y.
• The input features (X) are used to predict the target label (Y). So,
the independent variables are also known as predictor variables,
and the dependent variable is known as the response variable.
in machine learning, linear regression uses a linear equation to
model the relationship between a dependent variable (Y) and
one or more independent variables (Y).
• The main goal of the linear regression model is to find the best-
fitting straight line (often called a regression line) through a set
of data points.
Line of Regression
• A straight line that shows a relation between the dependent
variable and independent variables is known as the line of
regression or regression line
• Furthermore, the linear relationship can be positive or negative
in nature as explained below −
• 1. Positive Linear Relationship
• A linear relationship will be called positive if both independent
and dependent variable increases. It can be understood with the
help of the following graph −
2. Negative Linear Relationship
• A linear relationship will be called positive if the independent
increases and the dependent variable decreases. It can be
understood with the help of the following graph
• Linear regression is of two types, "simple linear regression" and
"multiple linear regression“
1. Simple Linear Regression
• Simple linear regression is a type of regression analysis in which
a single independent variable (also known as a predictor variable) is
used to predict the dependent variable. In other words, it models the
linear relationship between the dependent variable and a single
independent variable.
• In the above image, the straight line represents the simple linear
regression line where Ŷ is the predicted value, and X is the input
value.

• Mathematically, the relationship can be modeled as a linear


equation −

Y=W0+W1X+ϵ
• Where
• Y is the dependent variable (target).
• X is the independent variable (feature).
• w0 is the y-intercept of the line.
• w1 is the slope of the line, representing the effect of X on Y.
• ε is the error term, capturing the variability in Y not explained by X
• 2. Multiple Linear Regression
• Multiple linear regression is basically the extension of simple linear regression
that predicts a response using two or more features.
• When dealing with more than one independent variable, we extend simple linear
regression to multiple linear regression. The model is expressed as:

• Multiple linear regression extends the concept of simple linear regression to


multiple independent variables. The model is expressed as:

• Y=w0+w1X1+w2X2+⋯+wpXp+ϵ

• Where

• X1, X2, ..., Xp are the independent variables (features).


• w0, w1, ..., wp are the coefficients for these variables.
• ε is the error term.
How Does Linear Regression Work?
• The main goal of linear regression is to find the best-fit line through a set of data
points that minimizes the difference between the actual values and predicted
values. So it is done? This is done by estimating the parameters w0, w1 etc.

• The working of linear regression in machine learning can be broken down into
many steps as follows −

• Hypothesis− We assume that there is a linear relation between input and


output.
• Cost Function − Define a loss or cost function. The cost function quantifies the
model's prediction error. The cost function takes the model's predicted values and
actual values and returns a single scaler value that represents the cost of the
model's prediction.
• Optimization − Optimize (minimize) the model's cost function by updating the
model's parameters.
• It continues updating the model's parameters until the cost or error of the
model's prediction is optimized (minimized).
• Hypothesis Function For Linear Regression
• In linear regression problems, we assume that there is a linear relationship
between input features (X) and predicted value (Ŷ).

• The hypothesis function returns the predicted value for a given input value.
Generally we represent a hypothesis by hw(X) and it is equal to Ŷ.

• Hypothesis function for simple linear regression −

• ^=w0+w1X

• Hypothesis function for multiple linear regression −

• Y^=w0+w1X1+w2X2+⋯+wpXp

• For different values of parameters (weights), we can find many regression lines.
The main goal is to find the best-fit lines. Let's discuss it as below −
What is the best Fit Line?
Our primary objective while using
linear regression is to locate the best-
fit line, which implies that the error
between the predicted and actual
values should be kept to a minimum.
There will be the least error in the
best-fit line.
• The best Fit Line equation provides a
straight line that represents the
relationship between the dependent
and independent variables.
• The slope of the line indicates how
much the dependent variable
changes for a unit change in the
independent variable(s).
Here Y is called a dependent or target variable and X is called an
independent variable also known as the predictor of Y.
There are many types of functions or modules that can be used
for regression.
A linear function is the simplest type of function. Here, X may be a
single feature or multiple features representing the problem.
• Linear regression performs the task to predict a dependent
variable value (y) based on a given independent variable (x)).
• Hence, the name is Linear Regression.
• In the figure above, X (input) is the work experience and Y
(output) is the salary of a person. The regression line is the best-
fit line for our model.
• In linear regression some hypothesis are made to ensure
reliability of the model’s results.
• So, how can we minimize the error between the actual
and predicted values?
• Let's discuss the important concept, which is cost
function or loss function.

You might also like