
Model Evaluation Techniques in AI

The document discusses the importance of model evaluation in AI, emphasizing techniques like train-test splits to assess model performance and prevent overfitting. It outlines various evaluation metrics such as accuracy, precision, and recall, and their appropriate use cases, particularly in unbalanced datasets. Additionally, it highlights ethical concerns in model evaluation and provides insights into AI modeling approaches and deep learning categories.

Uploaded by

emmanuelchess102
Copyright
© All Rights Reserved

LESSON-3 EVALUATING MODELS

Need of model evaluation

 Model evaluation is like giving your AI model a report card. It helps you
understand its strengths, weaknesses, and suitability for the task at hand.
 This feedback loop is essential for building trustworthy and reliable AI
systems.
 With the need for model evaluation established, let’s see how to begin the
process.
 Different evaluation techniques are used, depending on the type and
purpose of the model.
Splitting the training set data for Evaluation
 Train-test split
Need of Train-test split: Explanation
 Dataset Collection
 You start with your dataset (all the data you have).
 Example: A dataset of 1,000 pictures of dogs with labels like “German Shepherd” or “Labrador.”
 Splitting the Data
 The dataset is divided into:
 Training Set (usually 70–80% of data) → used to train/teach the model.
 Test Set (20–30% of data) → used to evaluate how well the model performs on new data.
This ensures that the model is tested on data it has never seen before.
 Model Training (Fit Model)
 The algorithm (e.g., Decision Tree, Neural Network, etc.) looks at the training data and finds patterns.
 Example: The model learns that German Shepherds usually have upright ears, while Labradors don’t.
 Model Evaluation
 Once trained, we feed the test data (inputs only, without labels) to the model.
 The model predicts the outputs, and then we compare predictions with actual labels.
 Example: If the test image is of a Labrador, does the model also predict “Labrador”?
 Performance Score
 We calculate a score (accuracy, precision, recall, F1-score, etc.) to measure how well the model performed.
 In the illustrated example, the evaluation score is 0.67 (67% accuracy).
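The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a real image classifier: the dataset is synthetic, the single "ear score" feature is a hypothetical stand-in for what a model might learn from dog pictures, and the helper names (`train_test_split`, `fit`, `predict`, `accuracy`) are assumptions made for this sketch.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and return (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def predict(threshold, ear_score):
    """One-rule model: upright ears (high score) suggest a German Shepherd."""
    return "German Shepherd" if ear_score >= threshold else "Labrador"

def accuracy(threshold, samples):
    """Fraction of samples whose predicted label matches the actual label."""
    correct = sum(predict(threshold, x) == y for x, y in samples)
    return correct / len(samples)

def fit(train):
    """'Training': keep the threshold that scores best on the training set."""
    candidates = [t / 2 for t in range(0, 21)]  # 0.0, 0.5, ..., 10.0
    return max(candidates, key=lambda t: accuracy(t, train))

# Step 1: dataset collection (synthetic stand-in for 1,000 labelled pictures).
rng = random.Random(42)
dataset = []
for _ in range(1000):
    label = rng.choice(["German Shepherd", "Labrador"])
    mean = 7.0 if label == "German Shepherd" else 3.0
    dataset.append((rng.gauss(mean, 1.5), label))

# Step 2: split the data 80/20.
train_set, test_set = train_test_split(dataset, test_fraction=0.2)

# Step 3: fit the model on the training set only.
threshold = fit(train_set)

# Steps 4 and 5: evaluate on data the model has never seen and report a score.
test_accuracy = accuracy(threshold, test_set)
print(f"Learned threshold: {threshold}, test accuracy: {test_accuracy:.2f}")
```

The key design point is that `fit` only ever sees `train_set`, so the score computed on `test_set` reflects performance on new data.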
OVERFITTING

Why Not Train on the Whole Dataset?


 If you train and test on the same dataset, the model could just memorize all
answers (like a student memorizing exam answers instead of learning
concepts).
 It will give perfect results on training data but fail badly on new data → this
is overfitting.
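A deliberately extreme sketch of this memorization problem is shown below. The `MemorizingModel` class and the tiny odd/even dataset are hypothetical, chosen only to make the effect obvious: the model stores every training answer and has learned nothing it can apply to new inputs.

```python
class MemorizingModel:
    """Stores every training answer instead of learning a general rule."""

    def fit(self, samples):
        self.answers = {x: y for x, y in samples}

    def predict(self, x):
        # Inputs that were never seen during training cannot be answered.
        return self.answers.get(x, "unknown")

def accuracy(model, samples):
    correct = sum(model.predict(x) == y for x, y in samples)
    return correct / len(samples)

train = [(1, "odd"), (2, "even"), (3, "odd"), (4, "even")]
test = [(5, "odd"), (6, "even")]  # new, unseen inputs

model = MemorizingModel()
model.fit(train)

train_accuracy = accuracy(model, train)  # perfect on memorized data
test_accuracy = accuracy(model, test)    # fails on new data
print(train_accuracy, test_accuracy)
```

Evaluating only on `train` would report a perfect score and hide the failure, which is why a separate test set is needed.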
Accuracy and Error
Accuracy is an evaluation metric that measures the proportion of predictions a model gets
right out of all the predictions it makes.
Accuracy and model performance move together: the better the model performs, the more
accurate its predictions are.

Error can be described as an action that is inaccurate or wrong.
In Machine Learning, error is used to see how accurately a model can predict both the data
it learned from and new, unseen data.
Based on the error, we choose the machine learning model that performs best for a
particular dataset.
Error refers to the difference between a model's prediction and the actual outcome. It
quantifies how far off, or how often wrong, the model's predictions are.
Activity 1: Find the accuracy of the AI model
Calculate the accuracy of the House Price prediction AI model
 Read the instructions and fill in the blank cells in the table.
 The formula for finding error and accuracy is shown in the table.
 The accuracy of the AI model is the mean accuracy of all five samples.
 Percentage accuracy is obtained by multiplying the accuracy by 100.
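The activity's table is not reproduced here, so the sketch below rests on stated assumptions: error is taken as the absolute difference between actual and predicted price, error rate as error divided by the actual price, and accuracy as 1 minus the error rate. The five price pairs are hypothetical sample values, not the activity's data.

```python
def sample_accuracy(actual, predicted):
    """Accuracy of one prediction under the assumed formulas."""
    error = abs(actual - predicted)   # assumed: error = |actual - predicted|
    error_rate = error / actual       # assumed: error rate = error / actual
    return 1 - error_rate             # assumed: accuracy = 1 - error rate

# Hypothetical (actual price, predicted price) pairs for five houses.
samples = [(400, 380), (500, 525), (250, 250), (800, 760), (1000, 900)]

accuracies = [sample_accuracy(actual, predicted) for actual, predicted in samples]

# Model accuracy is the mean accuracy over all five samples.
model_accuracy = sum(accuracies) / len(accuracies)
percentage_accuracy = model_accuracy * 100
print(f"Model accuracy: {model_accuracy:.2f} ({percentage_accuracy:.0f}%)")
```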
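Evaluation metrics for Classification
Classification Metrics

Popular metrics used for classification models

 Confusion matrix
 Classification accuracy
 Precision
 Recall
Confusion matrix
A confusion matrix counts the model's predictions in four cells:
 True Positive (TP): the model predicts the positive class and the actual class is positive.
 True Negative (TN): the model predicts the negative class and the actual class is negative.
 False Positive (FP): the model predicts the positive class but the actual class is negative.
 False Negative (FN): the model predicts the negative class but the actual class is positive.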
Activity 2: Build the confusion matrix from scratch
Activity Guidelines
 Let’s assume we are predicting the presence of a disease: "yes" means the
person has the disease, and "no" means they don't have the disease.
 So, the AI model's output will be either Yes or No.
 The following chart shows the actual values and the predicted values.
Construct a confusion matrix.
 Can you tell how many are correct predictions among all predictions?
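One way to build the matrix by hand is to walk through the actual and predicted values pair by pair and count each of the four outcomes. The chart from the activity is not available here, so the ten Yes/No pairs below are hypothetical, and `confusion_matrix` is a helper written for this sketch rather than a library function.

```python
from collections import Counter

def confusion_matrix(actual, predicted, positive="Yes"):
    """Count TP, FP, FN and TN for a two-class (Yes/No) problem."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return {cell: counts[cell] for cell in ("TP", "FP", "FN", "TN")}

# Hypothetical actual vs predicted values for ten patients.
actual    = ["Yes", "No", "No",  "Yes", "No", "Yes", "No", "No", "Yes", "No"]
predicted = ["Yes", "No", "Yes", "Yes", "No", "No",  "No", "No", "Yes", "Yes"]

matrix = confusion_matrix(actual, predicted)
correct = matrix["TP"] + matrix["TN"]  # predictions that match the actual value
print(matrix, "correct predictions:", correct, "of", len(actual))
```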
Accuracy from Confusion matrix

Classification accuracy is the number of correct predictions made as a ratio of
all predictions made.
Calculate the Classification accuracy from this confusion matrix.
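Written in terms of the four cells, this ratio is (TP + TN) / (TP + TN + FP + FN). A small sketch follows; the function name and the counts are hypothetical and are not the values from the activity's matrix.

```python
def classification_accuracy(tp, tn, fp, fn):
    """Correct predictions (TP + TN) divided by all predictions."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("the confusion matrix is empty")
    return (tp + tn) / total

# Hypothetical counts: 85 of 100 predictions are correct.
acc = classification_accuracy(tp=50, tn=35, fp=10, fn=5)
print(f"Classification accuracy: {acc:.2f}")
```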
Can we use Accuracy all the time?

 Accuracy is only suitable when there are an equal number of observations in
each class, i.e., a balanced dataset (which is rarely the case), and when all
predictions and prediction errors are equally important, which is often not
the case.
Activity 3: Calculate the accuracy of the classifier model
Activity Guidelines
 Let’s assume you are testing your model on 1,000 test samples.
 Of these, the actual values are 900 Yes and only 100 No (an unbalanced
dataset).
 Let’s assume that you have built a faulty model which, irrespective of the
input, always predicts Yes.
 Can you tell the classification accuracy of this model?
Consider ‘Yes’ as the positive class and ‘No’ as the negative class.
Construct the confusion matrix from the Actual vs Predicted table.
Activity solution: Accuracy from Confusion matrix
So, the faulty model you made is showing an accuracy of 90%. Does this make sense?
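The 90% figure can be reproduced directly from the activity's numbers (900 actual Yes, 100 actual No, a model that always predicts Yes). The variable names below are illustrative only.

```python
# The activity's test set: 900 actual "Yes" and 100 actual "No".
actual = ["Yes"] * 900 + ["No"] * 100
# The faulty model ignores its input and always answers "Yes".
predicted = ["Yes" for _ in actual]

pairs = list(zip(actual, predicted))
tp = sum(a == "Yes" and p == "Yes" for a, p in pairs)
fp = sum(a == "No" and p == "Yes" for a, p in pairs)
fn = sum(a == "Yes" and p == "No" for a, p in pairs)
tn = sum(a == "No" and p == "No" for a, p in pairs)

accuracy = (tp + tn) / len(actual)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} accuracy={accuracy:.0%}")
```

The model never identifies a single "No" case (TN is 0), yet its accuracy still looks high because the "Yes" class dominates the test set.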
So, in cases of unbalanced data, we should use other metrics such as Precision, Recall or F1 score.
Let’s understand them one by one…
Precision from Confusion matrix

 Precision is the ratio of the number of correctly classified positive
examples (TP) to the total number of predicted positive examples
(TP + FP).
 Precision = 0.843 means that when our model predicts a patient has
heart disease, it is correct around 84% of the time.
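As a formula, Precision = TP / (TP + FP). The sketch below uses hypothetical counts, not the heart disease data behind the 0.843 figure, and the function name is an assumption made for illustration.

```python
def precision(tp, fp):
    """Of everything predicted positive, the fraction that really is positive."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("the model made no positive predictions")
    return tp / predicted_positive

# Hypothetical counts: 50 positive predictions, 45 of them correct.
p = precision(tp=45, fp=5)
print(f"Precision: {p:.2f}")
```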
Precision: where should we use it?
The Precision metric is generally used for unbalanced datasets when
False Positives are costly and the model needs to reduce the FPs as
much as possible.
Precision use case example:

 For example, take the case of predicting a good day, based on weather conditions, to
launch a satellite.
 Let’s assume a day with favorable weather conditions is the Positive class and a
day with unfavorable weather conditions is the Negative class.
 Missing out on predicting a good weather day is okay (low recall), but predicting
a bad weather day (Negative class) as a good weather day (Positive class) to
launch the satellite can be disastrous.
 So, in this case, the FPs need to be reduced as much as possible.
Recall: where should we use it?
The Recall metric is generally used for unbalanced
datasets when False Negatives are costly and the
model needs to reduce the FNs as much as possible.
Recall use case example
 For example, for a Covid-19 prediction classifier, let’s consider detection of a
Covid-19 affected case as the positive class and detection of a non-affected
case as the negative class.
 Imagine a Covid-19 affected person (Positive) is falsely predicted as not
affected (Negative). A person relying solely on the AI would not get any
treatment and may also end up infecting many other people.
 So, in this case, the FNs need to be reduced as much as possible.
 Hence, Recall is the go-to metric for this kind of use case.
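Recall is the fraction of actual positive cases that the model catches: Recall = TP / (TP + FN). A minimal sketch with hypothetical counts for a classifier like the one described above:

```python
def recall(tp, fn):
    """Of all actual positive cases, the fraction the model detected."""
    actual_positive = tp + fn
    if actual_positive == 0:
        raise ValueError("there are no actual positive cases")
    return tp / actual_positive

# Hypothetical counts: 100 affected people, 20 of them wrongly predicted as not affected.
r = recall(tp=80, fn=20)
print(f"Recall: {r:.2f}")
```

Every false negative lowers recall, so maximizing recall is the same as minimizing missed positive cases.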
Activity 4: Decide the appropriate metric to evaluate the AI model
Scenario: Flagging fraudulent transactions
 You have designed a model to detect fraudulent credit card transactions.
 You are testing your model with a highly unbalanced dataset.
 What is the metric to be considered in this case?
 It is okay to classify a legit transaction as fraudulent (false positive): it can
always be re-verified by passing it through additional checks.
 But it is definitely not okay to classify a fraudulent transaction as legit (false
negative).
 So here false negatives should be reduced as much as possible.
 Hence in this case, Recall is more important.
 For the given data, construct the confusion matrix.
 Calculate the recall from the confusion matrix.
Fill the matrix based on the table given above
Activity solution: Decide the appropriate metric to evaluate the AI model

Calculate the recall from the confusion matrix based on.
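The activity's table is not reproduced here, so the following sketch uses hypothetical counts for 1,000 transactions (10 fraudulent, 990 legit) to show why recall is the more informative metric in this scenario. "Fraud" is treated as the positive class.

```python
# Hypothetical confusion matrix for 1,000 transactions.
tp = 6    # fraudulent transactions correctly flagged
fn = 4    # fraudulent transactions wrongly passed as legit
fp = 20   # legit transactions wrongly flagged (can be re-verified)
tn = 970  # legit transactions correctly passed

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.1%}")  # looks excellent
print(f"Recall:   {recall:.1%}")    # reveals the missed fraud cases
```

With these made-up numbers, accuracy is dominated by the many legit transactions, while recall exposes that 4 of the 10 fraud cases slipped through.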


Ethical concerns around model evaluation
While evaluating an AI model, the following ethical concerns need to be kept in mind
UNIT-2
MODELLING IN AI
EXAMPLE: Object Classification, Anomaly Detection
DEEP LEARNING- Deep Learning, or DL, enables
software to train itself to perform tasks with vast
amounts of data. Example: Object identification,
Digital Recognition
COMMON TERMINOLOGIES USED WITH DATA
Data - information in any form
Feature - a column of a table
Labels - attaching meaning to data
Labelled data - data to which some tag/label is attached
Training Data set - The training data set is a collection of examples given to the model to analyze and learn.
Testing Data set - The testing data set is used to test the accuracy of the model.
MODELLING
AI Modelling refers to developing algorithms, also called models, which can be trained to
produce intelligent outputs. That is, writing code to make a machine artificially intelligent.
 1. Rule-Based Approach
Definition : A rule-based approach is a system that uses
predefined rules (IF-THEN statements) set by humans to make
decisions or solve problems.
Example : A spam filter that blocks emails IF they contain certain
words like "free money" or "click here" — THEN mark it as spam.
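The spam filter example can be written as a single IF-THEN rule. This is a minimal sketch: the phrase list comes from the example above, while the function name `is_spam` and the sample emails are hypothetical.

```python
# Rules are written by a human, not learned from data.
SPAM_PHRASES = ("free money", "click here")

def is_spam(email_text):
    """IF the email contains any listed phrase THEN mark it as spam."""
    text = email_text.lower()
    return any(phrase in text for phrase in SPAM_PHRASES)

flagged = is_spam("Click HERE to claim your prize")
allowed = is_spam("Meeting moved to 3 pm")
print(flagged, allowed)
```

The filter only knows the phrases it was given: an email using wording that is not on the list passes straight through, which is the main limitation of a rule-based approach.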
 2. Learning-Based Approach
Definition: A learning-based approach uses data to train a model
so that it can learn patterns and make decisions without being
explicitly programmed with rules.
Example: A recommendation system on YouTube that suggests
videos based on what you have watched before. The system
learns your preferences from data.
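By contrast, a learning-based system derives its behaviour from data. The toy sketch below is not how any real recommendation system works; it only illustrates the idea of learning preferences from a watch history. All titles, categories, and function names are hypothetical.

```python
from collections import Counter

def learn_preferences(watch_history):
    """'Training': count how often each category was watched."""
    return Counter(category for _title, category in watch_history)

def recommend(preferences, catalogue, already_watched):
    """Suggest unwatched videos from the most-watched category."""
    favourite, _count = preferences.most_common(1)[0]
    return [title for title, category in catalogue
            if category == favourite and title not in already_watched]

watch_history = [
    ("Chess openings", "chess"),
    ("Endgame basics", "chess"),
    ("Pasta in 10 minutes", "cooking"),
]
catalogue = watch_history + [
    ("Famous chess games", "chess"),
    ("Baking bread", "cooking"),
]

preferences = learn_preferences(watch_history)
watched_titles = {title for title, _category in watch_history}
recommendations = recommend(preferences, catalogue, watched_titles)
print(recommendations)
```

No rule about chess was ever written by hand: a different watch history would produce different recommendations from the same code.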
CATEGORIES OF LEARNING BASED APPROACH
Type of Learning | Definition | Data Used | Goal | Example
Supervised Learning | Model learns from labeled data. | Labeled data | Predict output for new/unseen data | Email spam detection (spam/not spam)
Unsupervised Learning | Model learns patterns or groupings from unlabeled data. | Unlabeled data | Discover hidden patterns or clusters | Customer grouping based on purchase behavior
Reinforcement Learning | Model learns by trial and error using rewards and penalties. | Experience from environment | Take actions to maximize total reward | Robot learning to walk or play a video game
Sub-Categories of Supervised Learning

Sub-Category | Definition | Output Type | Example
Classification | Predicts a category or label from the data. | Categorical (discrete) | Email classification: spam or not spam; Fruit: apple or banana
Regression | Predicts a numerical value based on the input data. | Continuous (numeric) | Predicting house prices based on size and location
Sub-Categories of Unsupervised Learning

Sub-Category | Definition | Goal | Example
Clustering | Groups data points into clusters based on similarity without any labels. | Discover natural groupings in data | Grouping customers by shopping habits
Association | Finds relationships or patterns among items in large datasets. | Identify if-then relationships | Market Basket Analysis: If a customer buys bread, they also buy butter
DIFFERENCE BETWEEN CLASSIFICATION AND CLUSTERING
Feature | Classification | Clustering
Type of Learning | Supervised Learning | Unsupervised Learning
Labeled Data | Uses labeled data (with predefined categories) | Works with unlabeled data (no predefined categories)
Goal | Assign data to known categories | Group data based on similarity without predefined labels
Example | Classifying emails as “spam” or “not spam” | Grouping customers by purchasing behavior
CATEGORIES OF DEEP LEARNING
Deep Learning: Deep Learning enables software to train itself to
perform tasks with vast amounts of data. In deep learning, the
machine is trained with huge amounts of data which helps it to train
itself around the data. Such machines are intelligent enough to
develop algorithms for themselves.

 Artificial Neural Networks (ANN) - Artificial Neural Networks are
modelled on the human brain and nervous system. They are able
to automatically extract features without input from the
programmer. Every neural network node is essentially a machine
learning algorithm. It is useful when solving problems for which the
data set is very large.
 Convolutional Neural Network (CNN) - A Convolutional Neural
Network is a Deep Learning algorithm which can take in an input
image, assign importance (learnable weights and biases) to
various aspects/objects in the image and differentiate
one from the other.
Neural Network
 A neural network is a machine learning model inspired by the human brain. It consists of
layers of nodes (neurons) — an input layer, one or more hidden layers, and an output
layer.
 The input layer receives raw data.
 Hidden layers process data using weights, biases, and activation functions to extract
features.
 The output layer delivers the final result.
 Neural networks learn by adjusting weights through trial and error (training), gradually
improving performance.
 They are especially useful for large datasets, like in image recognition, chatbots, and price
prediction.
How does AI make decisions
 AI models like neural networks make decisions by processing input data, assigning
weights to each input, summing the results, and passing the result through an
activation function to produce an output. This mimics how humans make decisions by
considering different factors, weighing their importance, and coming to a conclusion.
Perceptron Working:
Takes inputs (e.g., weather conditions).
Multiplies each input by a weight (importance).
Adds a bias.
Passes the result through an activation function (like a decision threshold).
Outputs either 0 or 1 (e.g., No or Yes).
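The perceptron steps listed above can be sketched as follows. The weather inputs, the weights, and the bias are hypothetical values picked by hand for illustration; in a trained model they would be learned from data.

```python
def step(z):
    """Activation function: a simple decision threshold at zero."""
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias):
    # Multiply each input by its weight, sum the results, then add the bias.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Pass the result through the activation function to get 0 (No) or 1 (Yes).
    return step(weighted_sum)

# Hypothetical decision: "Should I go out?" with inputs [is_sunny, is_windy].
weights = [2.0, -1.0]  # sunshine counts strongly for, wind counts against
bias = -1.5

sunny_calm = perceptron([1, 0], weights, bias)
sunny_windy = perceptron([1, 1], weights, bias)
cloudy_calm = perceptron([0, 0], weights, bias)
print(sunny_calm, sunny_windy, cloudy_calm)
```

Changing a weight changes how much that factor matters, and changing the bias shifts how easily the output becomes 1.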
UNIT-1 REVISITING AI PROJECT CYCLE
AI PROJECT LIFE CYCLE
Problem Scoping → Data Acquisition → Data Exploration → Modelling → Evaluation → Deployment


To function effectively, the system uses a variety of data:
• Sensor data:
• Cameras (images/video for lane detection, traffic signs)
• LiDAR & Radar (depth and object detection)
• Ultrasonic sensors (for close-range obstacles)
• GPS (positioning)
• High-definition maps:
• Road layouts, traffic lights, speed limits
• IMU (Inertial Measurement Unit):
• For precise movement tracking (acceleration, orientation)
• Vehicle data:
• Speed, steering angle, braking status
• Historical & real-time traffic data:
• For prediction and planning
• Machine learning models:
• Trained on driving behavior, road conditions, object recognition, etc.
AI DOMAINS
Statistical Data - Example: Price Comparison Websites
Computer Vision - Example: Agricultural Monitoring, Surveillance system
Natural Language Processing - Example: Email filters, Machine Translation (Google Translate)
Ethical Frameworks for AI
FRAMEWORKS: A set of steps that help us in solving a problem. A framework
provides a step-by-step guide for solving the problem.

ETHICS: A set of values or morals which help us separate right from wrong.

ETHICAL FRAMEWORKS: Ethical frameworks are frameworks which
help us ensure that the choices we make do not cause unintended
harm.
By utilizing ethical frameworks, individuals and organizations can
make well-informed decisions that align with their values and
promote positive outcomes for all stakeholders involved.
3 factors knowingly or unknowingly influence decision making

Religion
Explanation: Religion often provides a moral framework or a set of beliefs that guide our decisions.
Example: Before making a choice, a person may ask, "Is this aligned with my religious views?"

Intuition & Values
Explanation: Sometimes, we rely on our gut feelings or personal values when making
decisions, especially in uncertain situations.
Example: You might think, "Does this feel right?" or "Does this align with my sense of right and wrong?"

Value of humans and non-humans
Explanation: Our decisions are shaped by how much we value human life, animal welfare, or the environment.
Example: A person might avoid using plastic because they value environmental sustainability.
TYPES OF ETHICAL FRAMEWORKS
Sector-based ethics: Focuses on specific sectors or industries, such
as bioethics (healthcare decisions).

Value-based ethics: Evaluates actions based on principles,
including utility (maximizing good) and virtue (aligning with personal
beliefs).
VALUE BASED
•Rights based: Prioritizes human life and fundamental rights over other considerations.
•Utility based: Focuses on choosing actions that create more good than harm.
•Virtue based: Ensures actions align with personal beliefs and moral character, such as
honesty, compassion and integrity.
BIOETHICS
Bioethics ensures ethical decision-making in healthcare and life
sciences, especially for AI applications. It revolves around four core
principles:
 Respect for Autonomy: Patients have the right to make informed
decisions about their own healthcare.
 Do Not Harm: Medical practices and AI systems should avoid
causing harm.
 Ensure Maximum Benefit: Actions should maximize positive outcomes
for individuals and society.
 Give Justice: Healthcare resources and decisions should be fair and
equitable.
Ethical Principles Guide Responsible Decision-making
Non-maleficence: Avoid causing harm and minimize negative consequences to individuals,
communities, or the environment.
Maleficence: The intentional act of causing harm or wrongdoing.
Beneficence: Actively promoting well-being by taking actions that maximize positive
outcomes for individuals and society.
