Unit Iv ML

This document discusses performance measurement metrics for evaluating machine learning models, focusing on classification and regression tasks. Key metrics include accuracy, confusion matrix, precision, recall, F1 score, AUC-ROC for classification, and mean absolute error, mean squared error, and R-squared for regression. It emphasizes the importance of selecting appropriate metrics based on the problem type and the distribution of classes in the data.

Uploaded by

chiman Saini

UNIT 4

Performance measurement of models in terms of accuracy, confusion matrix, precision & recall, F1
score, Receiver Operating Characteristic (ROC) curve and AUC, Median absolute deviation
(MAD), Distribution of errors.

Performance Metrics in Machine Learning

Evaluating the performance of a Machine learning model is one of the important
steps while building an effective ML model. To evaluate the performance or
quality of the model, different metrics are used, and these metrics are known
as performance metrics or evaluation metrics. These performance metrics help
us understand how well our model has performed for the given data. In this way, we
can improve the model's performance by tuning the hyper-parameters. Each ML
model aims to generalize well on unseen/new data, and performance metrics help
determine how well the model generalizes on the new dataset.

In machine learning, most supervised learning problems fall into one of two
categories: Classification and Regression. Not all metrics can be used for all types of
problems; hence, it is important to know and understand which metrics should be
used. Different evaluation metrics are used for Regression and for Classification
tasks. In this topic, we will discuss the metrics used for each.

1. Performance Metrics for Classification

In a classification problem, the category or class of each data point is identified
based on training data. The model learns from the given dataset and then classifies
new data into classes or groups based on the training. It predicts class labels as the
output, such as Yes or No, 0 or 1, Spam or Not Spam, etc. To evaluate the
performance of a classification model, different metrics are used, and some of them
are as follows:

o Accuracy
o Confusion Matrix
o Precision
o Recall
o F-Score
o AUC(Area Under the Curve)-ROC

I. Accuracy
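The accuracy metric is one of the simplest Classification metrics to implement, and it
is defined as the ratio of the number of correct predictions to the total number of
predictions.

It can be formulated as:

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$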

To implement an accuracy metric, we can compare ground truth and predicted
values in a loop, or we can use the scikit-learn library.

First, we need to import the accuracy_score function from scikit-learn as
follows:

    from sklearn.metrics import accuracy_score

Here, metrics is a module of sklearn.

Then we need to pass the ground truth and predicted values to the function to
calculate the accuracy:

    # y_test holds the ground-truth labels, y_hat the model's predictions
    print(f'Accuracy Score is {accuracy_score(y_test, y_hat)}')

Although it is simple to use and implement, it is suitable only for cases where an
approximately equal number of samples belong to each class.
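The loop-based approach mentioned above can be sketched in plain Python as follows. This is a minimal illustration, not the scikit-learn implementation; the function name accuracy and the label lists are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            correct += 1
    return correct / len(y_true)


# Hypothetical labels, invented for illustration only.
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
y_hat = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_test, y_hat)  # 6 of the 8 predictions match
print(f"Accuracy Score is {acc}")
```

Here six of the eight predictions agree with the ground truth, so the score is 6/8 = 0.75.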

When to Use Accuracy?

It is good to use the Accuracy metric when the target variable classes in data are
approximately balanced. For example, suppose 60% of the images in a fruit image
dataset are of Apple and 40% are of Mango. In this case, if the model that predicts
whether an image is of Apple or Mango reaches, say, 97% accuracy, that figure is a
meaningful indication of quality, because always predicting the majority class
would reach only 60%.

When not to use Accuracy?

It is recommended not to use the Accuracy measure when the target variable majorly
belongs to one class. For example, suppose there is a model for disease
prediction in which, out of 100 people, only five people have the disease, and 95
people don't have it. In this case, if our model predicts that no person has the
disease (which is a bad prediction, since it misses every sick person), the Accuracy
measure will still be 95%, which is misleading.
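The disease example above can be reproduced in a few lines. This sketch assumes the label 1 means "has the disease" and 0 means "no disease"; the variable names are illustrative.

```python
# 100 people: 5 have the disease (1), 95 do not (0).
y_true = [1] * 5 + [0] * 95

# A useless model that predicts "no disease" for everyone.
y_pred = [0] * 100

accuracy = sum(a == p for a, p in zip(y_true, y_pred)) / len(y_true)

# How many of the truly sick people did the model find?
found = sum(1 for a, p in zip(y_true, y_pred) if a == 1 and p == 1)
recall = found / sum(y_true)

print(f"Accuracy: {accuracy:.2f}")  # high, despite the model being useless
print(f"Sick people detected: {found} of {sum(y_true)} (recall {recall:.2f})")
```

The accuracy comes out at 0.95 while not a single sick person is detected, which is why accuracy alone is not informative on heavily imbalanced data.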

II. Confusion Matrix

A confusion matrix is a tabular representation of prediction outcomes of any binary
classifier, which is used to describe the performance of the classification model on a
set of test data when true values are known.

The confusion matrix is simple to implement, but the terminologies used in this
matrix might be confusing for beginners.

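A typical confusion matrix for a binary classifier has the layout shown below (it
can be extended to classifiers with more than two classes). The totals are those of
the example discussed next.

                 Predicted: No          Predicted: Yes         Total
Actual: No       True Negative (TN)     False Positive (FP)    60
Actual: Yes      False Negative (FN)    True Positive (TP)     105
Total            55                     110                    165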

We can determine the following from the above matrix:

o In the matrix, columns are for the predicted values, and rows specify the
Actual values. Here Actual and Predicted each take two possible classes, Yes or
No. So, if we are predicting the presence of a disease in a patient, the
Prediction column with Yes means the patient has the disease, and No means the
patient doesn't have the disease.
o In this example, the total number of predictions is 165, of which 110 were
Yes and 55 were No.
o However, in reality, 60 patients don't have the disease, whereas 105
patients have the disease.

In general, the table is divided into four terminologies, which are as follows:

1. True Positive (TP): The model predicted the positive class (Yes), and the
actual class is positive.
2. True Negative (TN): The model predicted the negative class (No), and the
actual class is negative.
3. False Positive (FP): The model predicted the positive class, but the actual
class is negative.
4. False Negative (FN): The model predicted the negative class, but the actual
class is positive.
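As a minimal sketch, the four counts defined above can be tallied with a single pass over the labels. The function name confusion_counts, the choice of "Yes" as the positive class, and the sample labels are assumptions made for this illustration.

```python
def confusion_counts(y_true, y_pred, positive="Yes"):
    """Count TP, TN, FP and FN for a binary problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # predicted Yes, actually Yes
        elif predicted != positive and actual != positive:
            tn += 1  # predicted No, actually No
        elif predicted == positive:
            fp += 1  # predicted Yes, actually No
        else:
            fn += 1  # predicted No, actually Yes
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels, invented for illustration only.
y_true = ["Yes", "Yes", "No", "No", "Yes", "No"]
y_pred = ["Yes", "No", "No", "Yes", "Yes", "No"]

counts = confusion_counts(y_true, y_pred)
print(counts)
```

For these six samples the tally is two true positives, two true negatives, one false positive and one false negative.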

III. Precision
The precision metric is used to overcome the limitation of Accuracy. Precision
determines the proportion of positive predictions that were actually correct. It is
calculated as the ratio of True Positives to the total number of positive predictions
(True Positives and False Positives):

$$\text{Precision} = \frac{TP}{TP + FP}$$

IV. Recall or Sensitivity

It is similar to the Precision metric; however, it calculates the proportion
of actual positives that were identified correctly. It is calculated as the ratio of
True Positives to the total number of actual positives, whether correctly
predicted as positive or incorrectly predicted as negative (True Positives and False
Negatives).

The formula for calculating Recall is given below:

$$\text{Recall} = \frac{TP}{TP + FN}$$

When to use Precision and Recall?

From the above definitions of Precision and Recall, we can say that recall
describes the performance of a classifier with respect to false negatives, whereas
precision describes the performance of a classifier with respect to false
positives.

So, if we want to minimize false negatives, Recall should be as close to
100% as possible, and if we want to minimize false positives, Precision should be
as close to 100% as possible.

In simple words, maximizing precision minimizes the FP errors, and
maximizing recall minimizes the FN errors.
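The two ratios can be computed directly from the confusion-matrix counts. The sketch below uses made-up counts; returning 0.0 when a denominator is zero is a convention chosen for this example, not something the definitions prescribe.

```python
def precision(tp, fp):
    """TP / (TP + FP): share of positive predictions that were right."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN): share of actual positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts, invented for illustration only.
tp, fp, fn = 8, 2, 4

p = precision(tp, fp)  # 8 / 10
r = recall(tp, fn)     # 8 / 12

print(f"Precision: {p:.3f}")
print(f"Recall:    {r:.3f}")
```

With these counts the classifier is right in 8 of its 10 positive predictions, but it finds only 8 of the 12 actual positives, so false negatives are its weaker side.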

V. F-Scores
F-score or F1 Score is a metric to evaluate a binary classification model on the basis
of predictions that are made for the positive class. It is calculated with the help of
Precision and Recall. It is a type of single score that represents both Precision and
Recall. So, the F1 Score can be calculated as the harmonic mean of both
precision and Recall, assigning equal weight to each of them.

The formula for calculating the F1 score is given below:

$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

When to use F-Score?

As the F1 score makes use of both precision and recall with equal weight, it should
be used when both of them are important for evaluation and a single number is
needed. When one of them is more important than the other, for example, when False
negatives are comparatively more costly than false positives, or vice versa, the
more general F-beta score can be used, which weights recall beta times as much as
precision.
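A small sketch of the harmonic mean shows why the F1 score punishes an imbalance between precision and recall more than a plain average would. The input values are invented for illustration, and returning 0.0 when both inputs are zero is a convention assumed here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(0.8, 0.8)   # both metrics equally good
lopsided = f1_score(1.0, 0.1)   # perfect precision, poor recall
plain_mean = (1.0 + 0.1) / 2    # arithmetic mean for comparison

print(f"F1 (0.8, 0.8): {balanced:.3f}")
print(f"F1 (1.0, 0.1): {lopsided:.3f}")
print(f"Arithmetic mean of (1.0, 0.1): {plain_mean:.3f}")
```

In the lopsided case the arithmetic mean is 0.55, while the F1 score stays below 0.2, because the harmonic mean is dominated by the smaller of the two values.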

VI. AUC-ROC
Sometimes we need to visualize the performance of the classification model on
charts; then, we can use the AUC-ROC curve. It is one of the popular and important
metrics for evaluating the performance of the classification model.

Firstly, let's understand the ROC (Receiver Operating Characteristic) curve. The ROC
curve is a graph that shows the performance of a classification model at
different threshold levels. The curve is plotted between two parameters, which are:

o True Positive Rate
o False Positive Rate

TPR or True Positive Rate is a synonym for Recall, hence can be calculated as:

$$TPR = \frac{TP}{TP + FN}$$

FPR or False Positive Rate can be calculated as:

$$FPR = \frac{FP}{FP + TN}$$

To compute the points of a ROC curve, we could evaluate a model (for example, a
logistic regression model) many times with different classification thresholds, but
this would be inefficient. Instead, the whole curve can be summarized efficiently by
a single number, which is known as AUC.

AUC: Area Under the ROC curve

AUC stands for Area Under the ROC Curve. As its name suggests, AUC
measures the two-dimensional area under the entire ROC curve, from the point (0, 0)
to the point (1, 1).

AUC measures the performance across all the thresholds and provides an
aggregate measure. The value of AUC ranges from 0 to 1. It means a model with
100% wrong predictions will have an AUC of 0.0, whereas a model with 100% correct
predictions will have an AUC of 1.0.
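To make the threshold sweep concrete, the sketch below computes the ROC points by using every distinct score as a threshold and then integrates them with the trapezoidal rule. It is a plain-Python illustration under stated assumptions: labels are 0/1, both classes are present, and the function names and sample scores are made up for the example.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct threshold, from (0, 0) to (1, 1)."""
    pos = sum(y_true)            # number of actual positives
    neg = len(y_true) - pos      # number of actual negatives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical labels and predicted scores, invented for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
area = auc(points)
print(points)
print(f"AUC = {area}")
```

For these four samples the curve passes through (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1) and (1, 1), which gives an area of 0.75. Only the ordering of the scores matters here, not their absolute values.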

When to Use AUC

AUC should be used to measure how well the predictions are ranked rather than
their absolute values. Moreover, it measures the quality of predictions of the model
without considering the classification threshold.

When not to use AUC

AUC is scale-invariant, which is not always desirable. When we need
well-calibrated probability outputs, AUC is not preferable, because it does not
measure calibration.

Further, AUC does not depend on the classification threshold, so it is not a useful
metric when there are wide disparities in the cost of false negatives vs. false
positives and it is critical to minimize one type of classification error.

2. Performance Metrics for Regression

Regression is a supervised learning technique that aims to find the relationships
between the dependent and independent variables. A predictive regression model
predicts a continuous numeric value. The metrics used for regression are different
from the classification metrics. It means we cannot use the Accuracy metric
(explained above) to evaluate a regression model; instead, the performance of a
Regression model is reported as errors in the prediction. Following are the popular
metrics that are used to evaluate the performance of Regression models.

o Mean Absolute Error
o Mean Squared Error
o R2 Score
o Adjusted R2

I. Mean Absolute Error (MAE)

Mean Absolute Error or MAE is one of the simplest metrics. It measures the
average absolute difference between actual and predicted values, where absolute
means taking the magnitude of a number, ignoring its sign.

To understand MAE, let's take an example of Linear Regression, where the model
draws a best fit line between dependent and independent variables. To measure the
error in prediction, we calculate the difference between the actual value and the
predicted value for each data point. To obtain a single error for the complete
dataset, we take the mean of the absolute values of these differences.

The below formula is used to calculate MAE:

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i - Y'_i \right|$$

Here,

Y_i is the Actual outcome, Y'_i is the predicted outcome for the i-th data point, and
N is the total number of data points.

MAE is more robust to outliers than squared-error metrics. One of the limitations of
MAE is that it is not differentiable at zero error, which complicates its use with
gradient-based optimizers such as Gradient Descent. To overcome this limitation,
another metric can be used, which is Mean Squared Error or MSE.
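As a minimal sketch, MAE is the mean of the absolute differences between actual and predicted values. The function name and the sample values below are assumptions made for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all data points."""
    n = len(y_true)
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical actual and predicted values, invented for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
print(f"MAE = {mae}")
```

The absolute errors are 0.5, 0, 1.5 and 1.0, so the MAE is 3.0 / 4 = 0.75, expressed in the same unit as the target variable.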

II. Mean Squared Error

Mean Squared Error or MSE is one of the most commonly used metrics for Regression
evaluation. It measures the average of the squared differences between the actual
values and the values predicted by the model.

Since in MSE the errors are squared, it takes only non-negative values,
and it is usually positive and non-zero.

Moreover, due to the squared differences, it penalizes large errors much more
heavily than small ones, and hence a few outliers can lead to over-estimation of how
bad the model is.

MSE is often preferred to other regression metrics because it is
differentiable and hence easier to optimize.

The formula for calculating MSE is given below:

$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( Y_i - Y'_i \right)^2$$

Here,

Y_i is the Actual outcome, Y'_i is the predicted outcome for the i-th data point, and
N is the total number of data points.
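The sketch below computes MSE and, for comparison, MAE on the same data, once with ordinary predictions and once with a single badly wrong prediction. The function names and values are made up for this illustration; the point is only to show how squaring amplifies a large error.

```python
def mean_squared_error(y_true, y_pred):
    """Average of (actual - predicted)^2 over all data points."""
    n = len(y_true)
    return sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred)) / n


def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted|, used here only for comparison."""
    n = len(y_true)
    return sum(abs(y - y_hat) for y, y_hat in zip(y_true, y_pred)) / n


# Hypothetical values, invented for illustration only.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]
y_pred_outlier = [2.5, 5.0, 4.0, 17.0]  # last prediction is off by 10

mse = mean_squared_error(y_true, y_pred)
mse_outlier = mean_squared_error(y_true, y_pred_outlier)
mae_outlier = mean_absolute_error(y_true, y_pred_outlier)

print(f"MSE without outlier: {mse}")
print(f"MSE with outlier:    {mse_outlier}")
print(f"MAE with outlier:    {mae_outlier}")
```

One prediction that is off by 10 raises the MSE from 0.875 to 25.625, while the MAE on the same predictions is only 3.0.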

III. R Squared Score

R squared score is also known as the Coefficient of Determination, which is another
popular metric used for Regression model evaluation. The R-squared metric enables
us to compare our model with a constant baseline to determine the performance of
the model. To select the constant baseline, we take the mean of the data
and draw the line at the mean.

The R squared score will always be less than or equal to 1, regardless of whether
the values are large or small.
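The comparison with the mean baseline can be written using the standard definition R² = 1 − SS_res / SS_tot, where SS_res is the sum of squared errors of the model and SS_tot is the sum of squared errors of the constant mean baseline. The sketch below is a plain-Python illustration with invented data; it assumes the actual values are not all identical, since SS_tot would then be zero.

```python
def r2_score(y_true, y_pred):
    """1 - (model squared error) / (mean-baseline squared error)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical values, invented for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]

r2 = r2_score(y_true, y_pred)

# The constant baseline: always predict the mean of the actual values.
baseline = [sum(y_true) / len(y_true)] * len(y_true)
r2_baseline = r2_score(y_true, baseline)

print(f"R2 of the model:    {r2:.4f}")
print(f"R2 of the baseline: {r2_baseline:.4f}")
```

The baseline itself scores exactly 0, a perfect model scores 1, and a model that is worse than the baseline gets a negative score.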

IV. Adjusted R Squared

Adjusted R squared, as the name suggests, is the improved version of the R squared
score. R squared has a limitation: its score improves (or at least never decreases)
as more terms are added, even though the model is not improving, and this may
mislead the data scientists. To overcome this issue, adjusted R squared is used,
which is never higher than R². It is because it adjusts for the increasing number of
predictors and only shows improvement if there is a real improvement.

We can calculate the adjusted R squared as follows:

$$R_a^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

Here,

n is the number of observations

k denotes the number of independent variables

and R_a^2 denotes the adjusted R²
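A short sketch using the standard adjusted R² formula, 1 − (1 − R²)(n − 1) / (n − k − 1), shows how the penalty grows with the number of independent variables. The function name and the numbers are assumptions for illustration; n is the number of observations and k the number of independent variables, as defined above.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical case: the same R^2 of 0.90 obtained with 5 and with 20 predictors.
few_predictors = adjusted_r2(0.90, n=50, k=5)
many_predictors = adjusted_r2(0.90, n=50, k=20)

print(f"k = 5:  adjusted R2 = {few_predictors:.4f}")
print(f"k = 20: adjusted R2 = {many_predictors:.4f}")
```

With the same R² of 0.90, the model that needed 20 predictors receives a visibly lower adjusted score than the one that needed 5, and both stay below 0.90.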

Mean Absolute Deviation


Mean Absolute Deviation is a statistical measure that helps us find
out the average spread of the data, i.e., Mean Absolute Deviation shows the
average distance of the observations of the dataset from the mean of the
dataset. It is helpful for analysing the data and understanding it better.
Mean Absolute Deviation is one of the measures of spread; the others include
range, quartiles, interquartile range, standard deviation, and variance.

What is Mean Absolute Deviation?

Mean Absolute Deviation (MAD) of a data set is the average distance between
each data point of the data set and the mean of the data, i.e., it represents the
amount of variation that occurs around the mean value in the data set. It is
also a measure of spread. It is calculated as the average of the
absolute differences between each value of the data set and the mean.

What is Measure of Spread?

The measure of spread represents the amount of dispersion in a data set, i.e.,
how spread out the values of the dataset are around the central value
(for example, the mean, mode, or median). It tells how far away the data points tend
to fall from the central value.
o A lower value of the measure of spread reflects that the data
points are close to the central value. In this case, the values in the
data set are more consistent.
o The further the data points are from the central value, the
greater the spread. In this case, the values are less
consistent.
In terms of the shape of the data, a narrow distribution represents
a lower spread, and a broad distribution represents a higher spread.

Mean Absolute Deviation Formula

As Mean Absolute Deviation is the average of the absolute value of deviation
about the mean of the data, its formula for grouped as well as ungrouped data
is given as follows:

For Ungrouped Data

The Mean Absolute Deviation Formula for ungrouped data is given as follows:

$$MAD = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \mu \right|$$

where,
o x_i represents each observation of the dataset,
o μ is the mean of the data set, and
o n is the number of observations in the data set.

For Grouped Data

The Mean Absolute Deviation Formula for grouped data is given as follows:

$$MAD = \frac{\sum_{i=1}^{n} f_i \left| x_i - \mu \right|}{\sum_{i=1}^{n} f_i}$$

where,
o x_i represents each distinct observation of the dataset,
o μ is the mean of the dataset,
o f_i represents the frequency of the corresponding observation x_i, and
o 1 ≤ i ≤ n, where n is the number of distinct observations in the grouped data set.

How to Calculate Mean Absolute Deviation?
To calculate the mean absolute deviation for a set of values, we can use the
following steps:
Step 1: Identify whether the data set is grouped or ungrouped and
calculate the mean.
Step 2: Calculate the absolute difference between each data point and the
mean.
Step 3: Add the absolute differences calculated for each data point in Step 2
(for grouped data, multiply each absolute difference by its frequency first).
Step 4: Divide the sum of absolute differences by the number of data points
to obtain the mean absolute deviation.
Using these steps, we can calculate the Mean Absolute Deviation of any
dataset, either grouped or ungrouped.
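The steps above can be sketched in plain Python for both kinds of data. The function names are made up for this illustration; the grouped version takes each distinct observation together with its frequency and should give the same result as the ungrouped version on the expanded data.

```python
def mad_ungrouped(values):
    """Mean absolute deviation about the mean for raw (ungrouped) data."""
    n = len(values)
    mean = sum(values) / n                              # Step 1
    total = sum(abs(x - mean) for x in values)          # Steps 2 and 3
    return total / n                                    # Step 4


def mad_grouped(observations, frequencies):
    """Mean absolute deviation about the mean for (value, frequency) data."""
    n = sum(frequencies)                                # total number of data points
    mean = sum(f * x for x, f in zip(observations, frequencies)) / n
    total = sum(f * abs(x - mean) for x, f in zip(observations, frequencies))
    return total / n


data = [1, 2, 2, 3, 5, 7, 7, 7, 7, 9]

# The same data written as distinct values with their frequencies.
observations = [1, 2, 3, 5, 7, 9]
frequencies = [1, 2, 1, 1, 4, 1]

ungrouped_result = mad_ungrouped(data)
grouped_result = mad_grouped(observations, frequencies)
print(ungrouped_result, grouped_result)
```

Both calls return 2.4 for this data set, whose mean is 5 and whose absolute deviations add up to 24.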

Mean Absolute Deviation vs. Standard Deviation

There are some differences between Mean Absolute Deviation and Standard
Deviation, which are as follows:

Parameters    Mean Absolute Deviation                  Standard Deviation

Definition    The average distance between each        The measure of how spread out the
              data point and the mean.                 data is from the mean.

Calculation   1. Calculate the mean of the data set.   1. Calculate the mean of the data set.
              2. Calculate the absolute value of the   2. Calculate the difference between
                 difference between each data point       each data point and the mean.
                 and the mean.                         3. Square each of those differences.
              3. Take the average of those absolute    4. Take the average of the squared
                 values.                                  differences.
                                                       5. Take the square root of the result.

Use           Useful when the data set contains        Useful when the data set does not
              outliers, as it is less affected by      contain outliers, as it provides a
              extreme values.                          more accurate measure of the spread
                                                       of the data.
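The two calculation procedures in the table can be compared side by side. The sketch below follows the table literally, so the standard deviation divides by the number of data points (the population form); the function names and the data are invented for illustration.

```python
import math


def mean_absolute_deviation(values):
    """Average of the absolute differences from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


def standard_deviation(values):
    """Square root of the average squared difference from the mean."""
    mean = sum(values) / len(values)
    variance = sum((x - mean) ** 2 for x in values) / len(values)
    return math.sqrt(variance)


# Hypothetical data, invented for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
data_with_outlier = [2, 4, 4, 4, 5, 5, 7, 49]  # last value replaced by an extreme one

mad = mean_absolute_deviation(data)
sd = standard_deviation(data)
mad_out = mean_absolute_deviation(data_with_outlier)
sd_out = standard_deviation(data_with_outlier)

print(f"Without outlier: MAD = {mad}, SD = {sd}")
print(f"With outlier:    MAD = {mad_out}, SD = {sd_out:.2f}")
```

Without the outlier the values are 1.5 and 2.0. With the outlier both measures grow, but the standard deviation grows by a larger factor, because squaring gives the extreme value more weight.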

Example: Mean Absolute Deviation About the Mean

Suppose that we start with the following data set:

1, 2, 2, 3, 5, 7, 7, 7, 7, 9.

The mean of this data set is 5. The following table will organize our work in
calculating the mean absolute deviation about the mean.

Data Value   Deviation from Mean   Absolute Value of Deviation
1            1 - 5 = -4            |-4| = 4
2            2 - 5 = -3            |-3| = 3
2            2 - 5 = -3            |-3| = 3
3            3 - 5 = -2            |-2| = 2
5            5 - 5 = 0             |0| = 0
7            7 - 5 = 2             |2| = 2
7            7 - 5 = 2             |2| = 2
7            7 - 5 = 2             |2| = 2
7            7 - 5 = 2             |2| = 2
9            9 - 5 = 4             |4| = 4
Total of Absolute Deviations:      24

We now divide this sum by 10, since there are a total of ten data values. The
mean absolute deviation about the mean is 24/10 = 2.4.

Example: Mean Absolute Deviation About the Mean

Now we start with a different data set:

1, 1, 4, 5, 5, 5, 5, 7, 7, 10.

Just like the previous data set, the mean of this data set is 5.

Data Value   Deviation from Mean   Absolute Value of Deviation
1            1 - 5 = -4            |-4| = 4
1            1 - 5 = -4            |-4| = 4
4            4 - 5 = -1            |-1| = 1
5            5 - 5 = 0             |0| = 0
5            5 - 5 = 0             |0| = 0
5            5 - 5 = 0             |0| = 0
5            5 - 5 = 0             |0| = 0
7            7 - 5 = 2             |2| = 2
7            7 - 5 = 2             |2| = 2
10           10 - 5 = 5            |5| = 5
Total of Absolute Deviations:      18

Thus the mean absolute deviation about the mean is 18/10 = 1.8. We compare
this result to the first example. Although the mean was identical for each of
these examples, the data in the first example was more spread out. We see
from these two examples that the mean absolute deviation from the first
example is greater than the mean absolute deviation from the second example.
The greater the mean absolute deviation, the greater the dispersion of our
data.

Example: Mean Absolute Deviation About the Median

Start with the same data set as the first example:

1, 2, 2, 3, 5, 7, 7, 7, 7, 9.

The median of the data set is 6. In the following table, we show the details of
the calculation of the mean absolute deviation about the median.

Data Value   Deviation from Median   Absolute Value of Deviation
1            1 - 6 = -5              |-5| = 5
2            2 - 6 = -4              |-4| = 4
2            2 - 6 = -4              |-4| = 4
3            3 - 6 = -3              |-3| = 3
5            5 - 6 = -1              |-1| = 1
7            7 - 6 = 1               |1| = 1
7            7 - 6 = 1               |1| = 1
7            7 - 6 = 1               |1| = 1
7            7 - 6 = 1               |1| = 1
9            9 - 6 = 3               |3| = 3
Total of Absolute Deviations:        24

Again we divide the total by 10 and obtain a mean absolute deviation about the
median of 24/10 = 2.4.

Example: Mean Absolute Deviation About the Mode

Start with the same data set as before:

1, 2, 2, 3, 5, 7, 7, 7, 7, 9.

This time we find the mode of this data set to be 7. In the following table, we
show the details of the calculation of the mean absolute deviation about the
mode.

Data Value   Deviation from Mode   Absolute Value of Deviation
1            1 - 7 = -6            |-6| = 6
2            2 - 7 = -5            |-5| = 5
2            2 - 7 = -5            |-5| = 5
3            3 - 7 = -4            |-4| = 4
5            5 - 7 = -2            |-2| = 2
7            7 - 7 = 0             |0| = 0
7            7 - 7 = 0             |0| = 0
7            7 - 7 = 0             |0| = 0
7            7 - 7 = 0             |0| = 0
9            9 - 7 = 2             |2| = 2
Total of Absolute Deviations:      24

We divide the sum of the absolute deviations by 10 and see that we have a mean
absolute deviation about the mode of 24/10 = 2.4.
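The worked examples can be checked with a short script. It uses the mean, median and mode functions from Python's standard statistics module; the helper name mean_abs_dev_about is made up for this illustration.

```python
from statistics import mean, median, mode


def mean_abs_dev_about(values, center):
    """Average absolute deviation of the values about a chosen central value."""
    return sum(abs(x - center) for x in values) / len(values)


first = [1, 2, 2, 3, 5, 7, 7, 7, 7, 9]
second = [1, 1, 4, 5, 5, 5, 5, 7, 7, 10]

about_mean = mean_abs_dev_about(first, mean(first))        # mean is 5
about_median = mean_abs_dev_about(first, median(first))    # median is 6
about_mode = mean_abs_dev_about(first, mode(first))        # mode is 7
second_about_mean = mean_abs_dev_about(second, mean(second))

print(f"First data set, about the mean:   {about_mean}")
print(f"First data set, about the median: {about_median}")
print(f"First data set, about the mode:   {about_mode}")
print(f"Second data set, about the mean:  {second_about_mean}")
```

For the first data set the absolute deviations add up to 24 about each of the three central values, so the result is 2.4 in all three cases; for the second data set the deviation about the mean is 1.8.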

Distribution of errors

Error Distribution of Machine Learning Model
