3.2 Class - Dm..

The document discusses the evaluation of classification model performance in machine learning, focusing on performance metrics such as accuracy, precision, recall, and F1 score. It outlines techniques to improve classification accuracy, including data-level improvements, feature engineering, model-level techniques, and evaluation strategy improvements. Additionally, it highlights the importance of proper validation and the challenges associated with evaluation metrics in imbalanced domains.

Uploaded by

ashishvishwakarma0404

Matters of Discussion

Machine Learning Experiments:

Evaluating classification model performance

[Review] Classification model performance - evaluating predictive model performance

Techniques to Improve Classification Accuracy

Compiled By: Dr. Nilamadhab Mishra [(PhD- CSIE) Taiwan]

Performance measures
❖ The performance of the developed model can be evaluated using a confusion matrix.

=== Confusion Matrix ===
   a    b   <-- classified as
 150   28 |  a = tested_negative
  32   51 |  b = tested_positive

                              Predicted class = Negative   Predicted class = Positive
Actual class = Negative [a]   True Negative (TN)           False Positive (FP)
Actual class = Positive [b]   False Negative (FN)          True Positive (TP)
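
The four cells can be tallied directly from paired actual and predicted labels. The sketch below is a minimal illustration, not the tool that produced the output above; the function name `confusion_counts` and the rebuilt label lists are assumptions chosen so that the tallies match the matrix shown (150, 28, 32, 51).

```python
def confusion_counts(actual, predicted, positive="tested_positive"):
    """Tally TN, FP, FN and TP for a binary classification problem."""
    tn = fp = fn = tp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1   # actual positive, predicted positive
            else:
                fn += 1   # actual positive, predicted negative
        else:
            if p == positive:
                fp += 1   # actual negative, predicted positive
            else:
                tn += 1   # actual negative, predicted negative
    return {"TN": tn, "FP": fp, "FN": fn, "TP": tp}


# Hypothetical label lists, constructed only to reproduce the matrix above.
neg, pos = "tested_negative", "tested_positive"
actual = [neg] * 178 + [pos] * 83
predicted = [neg] * 150 + [pos] * 28 + [neg] * 32 + [pos] * 51

counts = confusion_counts(actual, predicted)
print(counts)
```

Each row of the matrix corresponds to one actual class, and each column to one predicted class.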
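Data set for building the confusion matrix: example
[Figure: example data set and its confusion matrix, with cells laid out as TP, FP in the first row and FN, TN in the second row]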

Performance measures
• Performance metrics

Classifications - Classification methods

Decision Tree,
Naïve Bayes,
K-Nearest Neighbors

These methods have already been discussed.
The open question is how to estimate the performance of those algorithms using the performance measures.

We now investigate the performance measures.

Performance measures for Naïve Bayes classification [class-wise]

Comparison Result

Four common test options
Both training and testing require data.
The following four options are commonly used.
1. Use training set:
➢ You test the model on the same data it learned from.
➢ Not widely accepted, because a model can simply memorize the training instances (which also make up the test data) and still score well.
➢ Rarely used in research.

2. Supplied test set:
❖ It is an external file that you can use as the test set.
❖ It can be used when you want or need to test the algorithm's knowledge against a specific test set.

3. K-fold cross validation
❖ The data set is randomly divided into K disjoint sets (folds) of equal size, where each fold has roughly the same class distribution.
❖ For example, with 10 folds you repeat the following process 10 times: use 9 folds for training and leave 1 fold out for testing, so that each fold serves as the test set exactly once.
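
The fold construction can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions rather than the procedure of any particular tool: the helper names `stratified_k_folds` and `cross_validation_splits` and the 60/40 label list are hypothetical, and class balance is approximated by dealing each class's shuffled indices round-robin across the folds.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k=10, seed=0):
    """Return k disjoint index lists with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validation_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    folds = stratified_k_folds(labels, k, seed)
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set: 60 negative and 40 positive records.
labels = ["neg"] * 60 + ["pos"] * 40
splits = list(cross_validation_splits(labels, k=10))
for train, test in splits:
    print(len(train), "training records,", len(test), "test records")
```

A model would be trained and scored once per split, and the K scores are then usually averaged into a single estimate.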

4. Percentage split:
❖ Splits the data, setting aside x% of it for learning and the rest for testing.
❖ It is useful when your algorithm is slow, because the model is trained only once.
❖ A common choice is to train the algorithm on about 67% of the data and use the remaining 33% to test the classifier.
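
A percentage split amounts to a shuffle followed by a single cut. The sketch below assumes a 67/33 split; the function name `percentage_split` and the 100 placeholder records are hypothetical.

```python
import random


def percentage_split(records, train_fraction=0.67, seed=0):
    """Shuffle the records and cut them into a training part and a test part."""
    rng = random.Random(seed)
    shuffled = list(records)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 100 records, identified here by their index.
records = list(range(100))
train, test = percentage_split(records)
print(len(train), "records for training,", len(test), "records for testing")
```

Shuffling before the cut matters when the file is ordered (for example, sorted by class), since an unshuffled cut could leave one class missing from the test part.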

Model performance for classification models
A classification model is a machine learning model that predicts a categorical Y variable:
1. Will the employee leave the organization or stay?
2. Does the patient have cancer or not?
3. Does this customer fall into high risk, medium risk or low risk?
4. Will the customer pay or default on a loan?
A classification model in which the Y variable can take only 2 values is called a binary classifier.

CASE: Confusion matrix for customer class prediction
=== Confusion Matrix ===
   a    b   <-- classified as
 150   28 |  a = tested_negative
  32   51 |  b = tested_positive

TN = 150; FP = 28
FN = 32; TP = 51

Model performance measures
(Case values: TN = 150; FP = 28; FN = 32; TP = 51)
1. Accuracy = (TP + TN) / (TP + FP + TN + FN)
Accuracy is the number of correct predictions made by the model divided by the total number of records. The best accuracy is 100%, indicating that all the predictions are correct.
2. Sensitivity or recall = TP / (TP + FN)
Sensitivity (recall or true positive rate) is calculated as the number of correct positive predictions divided by the total number of positives.
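
As a quick check, the two definitions can be applied to the case values TN = 150, FP = 28, FN = 32, TP = 51. The snippet is only a worked calculation; the variable names are chosen for illustration.

```python
# Counts from the case confusion matrix.
TN, FP, FN, TP = 150, 28, 32, 51

accuracy = (TP + TN) / (TP + FP + TN + FN)   # correct predictions / all records
recall = TP / (TP + FN)                      # correct positives / actual positives

print(f"accuracy = {accuracy:.3f}")
print(f"recall   = {recall:.3f}")
```

For this case the accuracy is 201/261, roughly 77.0%, and the recall is 51/83, roughly 61.4%.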

3. Specificity = TN / (TN + FP)
Specificity (true negative rate) is calculated as the number of correct negative predictions divided by the total number of negatives.
4. Precision = TP / (TP + FP)
Precision (positive predictive value) is calculated as the number of correct positive predictions divided by the total number of positive predictions.
(Case values: TN = 150; FP = 28; FN = 32; TP = 51)
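
Applied to the case values TN = 150, FP = 28, FN = 32, TP = 51, the two definitions work out as follows. This is a worked calculation only, with variable names chosen for illustration.

```python
# Counts from the case confusion matrix.
TN, FP, FN, TP = 150, 28, 32, 51

specificity = TN / (TN + FP)   # correct negatives / actual negatives
precision = TP / (TP + FP)     # correct positives / predicted positives

print(f"specificity = {specificity:.3f}")
print(f"precision   = {precision:.3f}")
```

The specificity is 150/178, roughly 84.3%, and the precision is 51/79, roughly 64.6%, so the model is noticeably better at recognizing negatives than at making trustworthy positive predictions.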

5. KS statistic: The KS statistic is a measure of the degree of separation between the positive and negative score distributions. A KS value of 100% indicates that the scores partition the records exactly, such that one group contains all positives and the other contains all negatives. In practical situations, a KS value higher than 50% is desirable.
6. ROC chart & area under the curve (AUC)
The ROC chart plots 1 - specificity (the false positive rate) on the X axis and sensitivity on the Y axis. The area under the ROC curve is a measure of model performance. The AUC of a random classifier is 50% and that of a perfect classifier is 100%. For practical situations, an AUC of over 70% is desirable.
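
Both measures can be derived from the same list of ROC points. The sketch below is one possible way to compute them, assuming the classifier outputs a score per record and that labels are coded 1 (positive) and 0 (negative); the function names and the eight example scores are hypothetical. The KS statistic is taken as the largest gap between the true positive rate and the false positive rate, and the AUC is approximated with the trapezoidal rule.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points as the decision threshold is lowered."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Records with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


def ks_statistic(points):
    """Largest gap between the true positive rate and the false positive rate."""
    return max(tpr - fpr for fpr, tpr in points)


# Hypothetical scores for 4 positive (1) and 4 negative (0) records.
labels = [1, 1, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

points = roc_points(labels, scores)
print(f"AUC = {auc(points):.4f}, KS = {ks_statistic(points):.2f}")
```

For these example scores the AUC is 0.8125 and the KS value is 0.50; a classifier whose scores separate the two classes perfectly would reach 1.0 on both.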
7. Precision vs. recall: Recall or sensitivity gives us information about a model's performance with respect to false negatives (for example, customers who will default but are predicted not to), while precision gives us information about the model's performance with respect to false positives.
8. F-measure [a measure of a test's accuracy]
= F1 score = 2 * (Recall * Precision) / (Recall + Precision)
(F1 score and F score are alternate terms.)
(Case values: TN = 150; FP = 28; FN = 32; TP = 51)
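
For the case values TN = 150, FP = 28, FN = 32, TP = 51, the F1 score can be worked out from precision and recall as below. This is a worked calculation with illustrative variable names; the last line uses the equivalent count-based form 2TP / (2TP + FP + FN) as a cross-check.

```python
# Counts from the case confusion matrix.
TN, FP, FN, TP = 150, 28, 32, 51

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * (recall * precision) / (recall + precision)

# Equivalent form written directly in terms of the counts.
f1_from_counts = 2 * TP / (2 * TP + FP + FN)

print(f"F1 = {f1:.3f}")
```

The result is 102/162, roughly 0.630, which lies between the precision and the recall and closer to the smaller of the two.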
Performance measures Summary

Rules extracted from classification algorithms

The final classification rules are the actual classifier model used for prediction.
Challenge of Evaluation Metrics
1) Evaluation measures play a crucial role both in assessing classification performance and in guiding classifier modeling.
2) The use of common metrics in imbalanced domains can lead to sub-optimal classification models and might produce misleading conclusions, since these measures are insensitive to skewed domains.
✓ Skewness is a measure of the asymmetry of a probability distribution.
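
A small experiment makes the problem concrete. The data below are hypothetical (990 negatives and 10 positives), and the "classifier" is a deliberately trivial one that always predicts the majority class.

```python
# Hypothetical imbalanced data set: 990 negatives and 10 positives.
actual = ["neg"] * 990 + ["pos"] * 10

# A trivial "classifier" that always predicts the majority class.
predicted = ["neg"] * len(actual)

correct = sum(a == p for a, p in zip(actual, predicted))
true_positives = sum(a == "pos" and p == "pos" for a, p in zip(actual, predicted))
actual_positives = actual.count("pos")

accuracy = correct / len(actual)
recall = true_positives / actual_positives

print(f"accuracy = {accuracy:.1%}, recall = {recall:.1%}")
```

The accuracy is 99% although not a single positive case is found (recall is 0%), which is why accuracy alone can be misleading on skewed data.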

Techniques to Improve Classification Accuracy
1. Data-Level Improvements
1.1 Data Cleaning
Handle missing values (imputation, removal)
Remove duplicates
Correct labeling errors
Detect and treat outliers
1.2 Handling Class Imbalance
Oversampling (e.g., SMOTE)
Undersampling
Class weighting
Generate synthetic samples
1.3 Data Augmentation (especially for images/text/audio)
Image rotation, flipping, cropping
Text synonym replacement, paraphrasing
Noise injection in audio
1.4 Increase Dataset Size
Collect more data
Use transfer learning
Use pre-trained embeddings
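
Of the class-imbalance options listed under 1.2, the simplest to sketch is random oversampling, which duplicates minority-class records until the classes are the same size. Note that this is plain random oversampling and not SMOTE, which synthesizes new records instead of copying existing ones. The function name `random_oversample` and the 90/10 example data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen records of smaller classes up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_records, out_labels = list(records), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(records, labels) if l == cls]
        for _ in range(target - n):
            out_records.append(rng.choice(pool))   # copy of an existing record
            out_labels.append(cls)
    return out_records, out_labels


# Hypothetical training data: 90 negative and 10 positive records.
records = list(range(100))
labels = ["neg"] * 90 + ["pos"] * 10

new_records, new_labels = random_oversample(records, labels)
print(Counter(new_labels))
```

Resampling of this kind is normally applied to the training portion only, so that the test data keep the original class distribution.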
2. Feature Engineering
2.1 Feature Selection
Remove irrelevant features
Use:
Chi-square test
Information Gain
Recursive Feature Elimination (RFE)
L1 regularization
2.2 Feature Extraction
PCA (Principal Component Analysis)
LDA (Linear Discriminant Analysis)
Autoencoders
2.3 Feature Scaling
Standardization (Z-score normalization)
Min–Max scaling
Robust scaling (for outliers)
2.4 Domain-Specific Features
Use domain knowledge
Create interaction features
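
The two most common scaling methods from 2.3 can be written out directly from their definitions. The sketch assumes a single numeric feature with at least two distinct values (otherwise the divisions would fail); the function names and the `ages` example are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def min_max_scale(values):
    """Min-max scaling: map the smallest value to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature column.
ages = [20, 30, 40, 50, 60]

z_scores = standardize(ages)
scaled = min_max_scale(ages)
print([round(z, 3) for z in z_scores])
print(scaled)
```

In practice the mean, standard deviation, minimum and maximum are computed on the training data and then reused for the test data, which avoids leaking test information into the model.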
3. Model-Level Techniques
3.1 Algorithm Selection
Try multiple models:
Logistic Regression
SVM
k-NN
Decision Trees
Random Forest
Gradient Boosting (XGBoost, LightGBM, CatBoost)
Neural Networks
3.2 Hyperparameter Tuning
Grid Search
Random Search
Bayesian Optimization
Cross-validation tuning
3.3 Ensemble Methods
Bagging
Boosting
Stacking
Voting classifiers
Ensembles often significantly improve accuracy.
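
A voting classifier is the easiest ensemble to illustrate: each model predicts a class for every record, and the most frequent prediction wins. The sketch below shows hard (majority) voting only; the function name `majority_vote` and the three prediction lists are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model prediction lists into one list by majority vote."""
    combined = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions of three models for the same four records.
model_a = ["pos", "neg", "pos", "neg"]
model_b = ["pos", "pos", "neg", "neg"]
model_c = ["neg", "pos", "pos", "neg"]

combined = majority_vote([model_a, model_b, model_c])
print(combined)
```

Each model is wrong on a different record here, so the combined vote can be right where any single model is not; using an odd number of models avoids ties in the binary case.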

4. Regularization and Optimization
4.1 Prevent Overfitting
L1/L2 regularization
Dropout (neural networks)
Early stopping
Pruning (decision trees)
4.2 Better Optimization
Learning rate tuning
Adaptive optimizers (Adam, RMSProp)
Batch normalization
5. Evaluation Strategy Improvements
5.1 Proper Validation
k-fold cross-validation
Stratified sampling
Avoid data leakage
5.2 Better Metrics (when accuracy is misleading)
Precision
Recall
F1-score
ROC-AUC
Confusion matrix analysis

Practical Workflow for Improving Accuracy
i. Clean and preprocess data
ii. Perform exploratory data analysis (EDA)
iii. Engineer meaningful features
iv. Train baseline model
v. Tune hyperparameters
vi. Try ensemble methods
vii. Validate properly
viii. Analyze errors and iterate

ACTIVITY-07

SOLVED

In a Covid test of 1000 patients, there were 45 positive tests, of which 30 patients had Covid and 15 were falsely tested positive.

Of the 955 negative tests, 5 were incorrect: these patients had Covid but tested negative.

Draw the confusion matrix and calculate the accuracy, precision, recall, sensitivity, and F1 score from the matrix.

ACTIVITY-07—cont..
The total of 1000 cases consists of 45 positive tests (TP + FP), of which 30 are correct and 15 are incorrect. The other 955 cases are negative tests (TN + FN), of which 950 are correct and 5 are incorrect.

True Positive (TP) a correct positive test – 30

True Negative (TN) a correct negative test – 950
False Positive (FP) an incorrect positive test – 15
False Negative (FN) an incorrect negative test – 5

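ACTIVITY-07—cont..
Confusion matrix:
                       Test positive   Test negative
Has Covid              TP = 30         FN = 5
Does not have Covid    FP = 15         TN = 950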

ACTIVITY-07—cont..
• Accuracy – the percentage of all predictions that are correct.
• Precision – the percentage of positive predictions that are correct.
• Recall – the percentage of actual positive cases that the test has correctly identified.
• Sensitivity – the same as recall.
• F1 score – a measure that combines precision and recall with equal weight (their harmonic mean).

ACTIVITY-07—cont..
Accuracy
number of correct predictions / total number of predictions
= (30 + 950) / (30 + 15 + 950 + 5)
= 980/1000
= 49/50 or 98%
Precision
true positive / (true positive + false positive)
= 30 / (30 + 15)
= 30/45
= 2/3 or 66.7%

ACTIVITY-07—cont..
Recall (and sensitivity)
true positive / (true positive + false negative)
= 30 / (30 + 5)
= 30/35
= 0.857 or 85.7%
F1 score
2 x (precision * recall) / (precision + recall)
= 2 * (0.571 / 1.524)
= 2 * 0.375
= 0.75 or 75%
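
The hand calculations for this activity can be cross-checked with exact fractions. The snippet is only a verification aid; it uses the counts TP = 30, TN = 950, FP = 15, FN = 5 from the activity and Python's standard `fractions` module.

```python
from fractions import Fraction

# Counts from the Covid test activity.
TP, TN, FP, FN = 30, 950, 15, 5

accuracy = Fraction(TP + TN, TP + TN + FP + FN)
precision = Fraction(TP, TP + FP)
recall = Fraction(TP, TP + FN)          # sensitivity is the same quantity
f1 = 2 * (precision * recall) / (precision + recall)

for name, value in [("accuracy", accuracy), ("precision", precision),
                    ("recall", recall), ("F1", f1)]:
    print(f"{name}: {value} = {float(value):.1%}")
```

Working with fractions shows that the F1 score is exactly 3/4, so the 75% obtained above is not affected by the rounding of the intermediate values.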

ACTIVITY-13
Explore a classification problem case by
considering any real-world domain application,
formulate a confusion matrix through scenario
assumption for the classifier model, and
investigate the various parameters to measure
the performance of the classifier model.

Extra Query
1. Investigate the four test options to perform
training and testing for Machine learning
algorithms based on the dataset. What do the
four test options mean, and when do you use
them?
2. Investigate the significant challenges in the
context of the performance measures of the
classifier models that are connected to the real-
world application scenario.

Cheers For the Great Patience!
Query Please?
