Decision Tree Classification Overview

Decision Trees are a supervised machine learning algorithm used for classification and regression, structured as a tree with decision nodes and leaf nodes representing outcomes. The ID3 algorithm is a classic method for constructing Decision Trees, utilizing information gain to select the best features for splitting data. Overfitting can occur if the tree becomes too complex, and techniques like pre-pruning and post-pruning are used to mitigate this issue.

Uploaded by

nihar44203

Classification : Decision Tree

Decision Trees are a popular supervised machine learning algorithm used for
both classification and regression tasks. They work by learning simple
decision rules inferred from the data features to predict a target label.
• A Decision Tree is structured as a tree where each node represents a
decision based on a feature of the data, and each branch represents an
outcome of that decision.
• The root node is the topmost node, representing the first decision.

• The internal nodes are decision points, splitting based on feature values.

• Leaf nodes are the endpoints representing the final prediction or output.
Example training data (14 tuples; class label: buys_computer):

age     income   student   credit_rating   buys_computer
<=30    high     no        fair            no
<=30    high     no        excellent       no
31…40   high     no        fair            yes
>40     medium   no        fair            yes
>40     low      yes       fair            yes
>40     low      yes       excellent       no
31…40   low      yes       excellent       yes
<=30    medium   no        fair            no
<=30    low      yes       fair            yes
>40     medium   yes       fair            yes
<=30    medium   yes       excellent       yes
31…40   medium   no        excellent       yes
31…40   high     yes       fair            yes
>40     medium   no        excellent       no
Basic algorithm (a greedy algorithm):

• Tree is constructed in a top-down recursive divide-and-conquer manner

• At start, all the training examples are at the root
• Attributes are categorical (if continuous-valued, they are discretized in advance)
• Examples are partitioned recursively based on selected attributes
• Test attributes are selected based on a heuristic or statistical measure (e.g., information
gain)
ID3 (Iterative Dichotomiser 3) is a classic algorithm used to construct a
Decision Tree, primarily for classification tasks. It was developed by Ross
Quinlan and serves as one of the earliest algorithms in Decision Tree-based
models.
• ID3 aims to create a Decision Tree by selecting, at each step, the feature
that best separates the data for classification.
• It does this by using information gain (based on entropy) as the criterion to
determine the best feature to split on.

Conditions for stopping partitioning:

• All records have the same target class (pure leaf node).
• No more features to split on (terminate as a leaf node).
• Reaching a specified maximum depth or minimum samples threshold to
prevent overfitting.
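The recursive procedure and stopping conditions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation of ID3: the function names (`entropy`, `info_gain`, `id3`), the nested-dict tree representation, and the optional `max_depth` parameter are all assumptions made for this example. The data is the 14-row buys_computer table, and entropy and information gain are the measures the following slides define.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a list of class labels."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, target):
    """Entropy of the target minus the weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def id3(rows, attrs, target, depth=0, max_depth=None):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping conditions: pure node, no attributes left, or depth limit reached.
    if len(set(labels)) == 1 or not attrs:
        return majority
    if max_depth is not None and depth >= max_depth:
        return majority
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    remaining = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = id3(subset, remaining, target, depth + 1, max_depth)
    return {best: branches}

COLUMNS = ["age", "income", "student", "credit_rating", "buys_computer"]
DATA = """\
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
"""
rows = [dict(zip(COLUMNS, line.split())) for line in DATA.splitlines()]
tree = id3(rows, COLUMNS[:-1], "buys_computer")
print(tree)
```

Passing `max_depth` turns the depth-based stopping condition on; with the default `None`, the tree grows until every leaf is pure or the attributes run out.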
Attribute Selection Measure: Information Gain
• $D$ = set of data instances
• $|D|$ = number of data instances (14)
• $p_i$ = the probability that an arbitrary tuple in $D$ belongs to class $C_i$, estimated by $|C_{i,D}|/|D|$
• Example: taking class yes as $C_1$, $p_1 = 9/14$
Expected information (entropy) needed to classify a tuple in $D$, where $m$ is the number of classes:
$$Info(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$
For the given data (9 yes, 5 no):
$$Info(D) = I(9,5) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.940$$
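The entropy calculation can be checked with a few lines of Python. This is a small sketch; the helper name `entropy` and its list-of-counts input are choices made for this example.

```python
import math

def entropy(class_counts):
    """Info(D) = -sum(p_i * log2(p_i)), skipping classes with zero count."""
    total = sum(class_counts)
    return sum(-(c / total) * math.log2(c / total) for c in class_counts if c > 0)

# 9 "yes" and 5 "no" tuples, as in the buys_computer table
info_d = entropy([9, 5])
print(f"{info_d:.3f}")  # 0.940
```

A pure partition such as `[4, 0]` has entropy 0, and an evenly split two-class partition such as `[3, 3]` has entropy 1.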
Information needed (after using $A$ to split $D$ into $v$ partitions):
$$Info_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times Info(D_j)$$
Example, where $I(y, n)$ denotes the entropy of a partition with $y$ yes and $n$ no tuples:
$$Info_{age}(D) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694$$
$\frac{5}{14} I(2,3)$ means “age <=30” has 5 out of 14 samples, with 2 yes’es and 3 no’s.
Similarly, the information needed after splitting on each of the other attributes:
$$Info_{income}(D) = \frac{4}{14} I(2,2) + \frac{6}{14} I(4,2) + \frac{4}{14} I(3,1) = 0.911$$
$$Info_{student}(D) = \frac{7}{14} I(6,1) + \frac{7}{14} I(3,4) = 0.788$$
$$Info_{credit\_rating}(D) = \frac{8}{14} I(6,2) + \frac{6}{14} I(3,3) = 0.892$$
Information gained by branching on attribute $A$:
$$Gain(A) = Info(D) - Info_A(D)$$
$$Gain(age) = Info(D) - Info_{age}(D) = 0.940 - 0.694 = 0.246$$
Similarly,
$$Gain(income) = 0.940 - 0.911 = 0.029$$
$$Gain(student) = 0.940 - 0.788 = 0.152$$
$$Gain(credit\_rating) = 0.940 - 0.892 = 0.048$$
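The gain for every attribute can be recomputed from per-value class counts. The sketch below assumes the (yes, no) counts tallied by hand from the 14-row table; the function names are illustrative only.

```python
import math

def entropy(counts):
    total = sum(counts)
    return sum(-(c / total) * math.log2(c / total) for c in counts if c > 0)

def expected_info(partitions):
    """Info_A(D): entropy of each partition, weighted by its share of D."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * entropy(p) for p in partitions)

def gain(partitions):
    """Gain(A) = Info(D) - Info_A(D)."""
    # Class totals for D are the column-wise sums over all partitions.
    totals = [sum(col) for col in zip(*partitions)]
    return entropy(totals) - expected_info(partitions)

# (yes, no) counts per attribute value, tallied from the table
partitions = {
    "age": [(2, 3), (4, 0), (3, 2)],       # <=30, 31…40, >40
    "income": [(2, 2), (4, 2), (3, 1)],    # high, medium, low
    "student": [(6, 1), (3, 4)],           # yes, no
    "credit_rating": [(6, 2), (3, 3)],     # fair, excellent
}
for name, parts in partitions.items():
    print(f"{name}: {gain(parts):.3f}")

best = max(partitions, key=lambda name: gain(partitions[name]))
print("best attribute:", best)  # best attribute: age
```

Because the code does not round intermediate values, a printed gain can differ from the hand calculation in the third decimal place.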
Age has the highest information gain, so the first split is on the age attribute.
Next, repeat the same process at each of the other nodes with its smaller dataset.
Next split is on student at node age <=30:
• student = yes: all are yes, so no further split here
• student = no: all are no, so no further split here
No split is required at node age 31…40: all are yes, so no further split here.
Next split is on credit_rating at node age >40:
• credit_rating = fair: all are yes, so no further split here
• credit_rating = excellent: all are no, so no further split here
Classify a new instance
(age: <=30, income: low, student: yes, credit_rating: fair) = buys_computer?
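One way to answer this is to write the finished tree down and walk it. The sketch below assumes a nested-dict encoding of the tree from the walkthrough (root on age, then student under <=30 and credit_rating under >40), with leaf labels read off the training table. The names `TREE` and `classify` are made up for this example.

```python
# Tree from the worked example: {attribute: {value: subtree or class label}}
TREE = {
    "age": {
        "<=30": {"student": {"no": "no", "yes": "yes"}},
        "31…40": "yes",
        ">40": {"credit_rating": {"excellent": "no", "fair": "yes"}},
    }
}

def classify(tree, instance):
    """Follow the branch matching the instance's value until a leaf is reached."""
    while isinstance(tree, dict):
        attr = next(iter(tree))          # the attribute tested at this node
        tree = tree[attr][instance[attr]]
    return tree

new_instance = {"age": "<=30", "income": "low", "student": "yes", "credit_rating": "fair"}
print(classify(TREE, new_instance))  # yes
```

Only age and student are consulted for this instance: age <=30 leads to the student node, and student = yes leads to the leaf yes.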
Exercise

[Link]
1YsHFA55DZ5iQBNq7S11qoS-_WVYtxUYH?usp=sharing
Computing Information-Gain for Continuous-Valued Attributes
• Let attribute A be a continuous-valued attribute
• Must determine the best split point for A
• Sort the values of A in increasing order
• Typically, the midpoint between each pair of adjacent values is
considered as a possible split point
• $(a_i + a_{i+1})/2$ is the midpoint between the values of $a_i$ and $a_{i+1}$
• The point with the minimum expected information requirement
for A is selected as the split-point for A
• Split:
• D1 is the set of tuples in D satisfying A ≤ split-point, and D2 is the set
of tuples in D satisfying A > split-point
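The midpoint search described above can be sketched as follows. The numeric ages and labels are made-up values for illustration only (the slides' table uses pre-discretized age ranges), and `best_split_point` is a hypothetical helper name.

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split_point(values, labels):
    """Return (split_point, expected_info) minimizing Info_A(D), or None."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))       # sort the values of A
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        split = (lo + hi) / 2            # midpoint of adjacent values
        d1 = [y for v, y in pairs if v <= split]   # A <= split-point
        d2 = [y for v, y in pairs if v > split]    # A > split-point
        info = (len(d1) * entropy(d1) + len(d2) * entropy(d2)) / len(pairs)
        if best is None or info < best[1]:
            best = (split, info)
    return best

# Made-up example data, not taken from the slides
ages = [22, 25, 28, 35, 38, 45, 52, 60]
buys = ["no", "no", "no", "yes", "yes", "yes", "yes", "yes"]
split, info = best_split_point(ages, buys)
print(split, info)  # 31.5 0.0
```

The function returns `None` when the attribute has fewer than two distinct values, since no split point exists in that case.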
Gain Ratio for Attribute Selection (C4.5)
• Information gain measure is biased towards attributes with many values
• C4.5 (a successor of ID3) uses gain ratio to overcome the problem
(normalization to information gain)
• $SplitInfo_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \log_2\left(\frac{|D_j|}{|D|}\right)$
• $GainRatio(A) = Gain(A) / SplitInfo_A(D)$
• Example: income splits D into partitions of 4, 6, and 4 tuples, so
$SplitInfo_{income}(D) = -\frac{4}{14}\log_2\frac{4}{14} - \frac{6}{14}\log_2\frac{6}{14} - \frac{4}{14}\log_2\frac{4}{14} = 1.557$
• gain_ratio(income) = 0.029/1.557 = 0.019
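The income example can be reproduced with a short script. The helper names `split_info` and `gain_ratio` are illustrative, and the gain value 0.029 is taken from the earlier calculation rather than recomputed.

```python
import math

def split_info(partition_sizes):
    """SplitInfo_A(D) = -sum(|Dj|/|D| * log2(|Dj|/|D|)) over the partitions."""
    n = sum(partition_sizes)
    return sum(-(s / n) * math.log2(s / n) for s in partition_sizes if s > 0)

def gain_ratio(gain, partition_sizes):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D)."""
    return gain / split_info(partition_sizes)

# income splits the 14 tuples into high = 4, medium = 6, low = 4
sizes = [4, 6, 4]
print(f"{split_info(sizes):.3f}")         # 1.557
print(f"{gain_ratio(0.029, sizes):.3f}")  # 0.019
```

If an attribute puts every tuple in a single partition, `split_info` is 0 and the ratio is undefined; this sketch does not guard against that case.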
Gini Index (CART-Classification and Regression Trees)
Information Gain can exhibit a bias toward features with many unique values.
For example, if a feature has many unique categories, Information Gain tends
to prefer it because it splits the data into many small, nearly pure
subsets.
To address this, Gain Ratio (an adaptation of Information Gain) is used in
algorithms like C4.5 to counter this bias, but this additional step adds
complexity.
Gini Index, on the other hand, is less sensitive to the number of categories in a
feature, reducing the need for an additional adjustment like Gain Ratio.
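The slide does not state the formula, so for reference: the Gini index of a data set is commonly defined as $Gini(D) = 1 - \sum_{i} p_i^2$, and a split is scored by the size-weighted Gini of its partitions. A minimal sketch under that definition follows, with illustrative function names and the (yes, no) counts for the student attribute tallied from the table.

```python
def gini(class_counts):
    """Gini(D) = 1 - sum(p_i ** 2) over the classes."""
    n = sum(class_counts)
    return 1.0 - sum((c / n) ** 2 for c in class_counts)

def gini_split(partitions):
    """Weighted Gini index after splitting D into the given partitions."""
    n = sum(sum(p) for p in partitions)
    return sum(sum(p) / n * gini(p) for p in partitions)

# Whole data set: 9 yes, 5 no
print(f"{gini([9, 5]):.3f}")                   # 0.459
# Binary split on student: yes -> (6 yes, 1 no), no -> (3 yes, 4 no)
print(f"{gini_split([(6, 1), (3, 4)]):.3f}")   # 0.367
```

The split with the lowest weighted Gini (equivalently, the largest reduction from `gini` of the parent) is the preferred one.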
Overfitting in Decision Trees
• Overfitting happens when a Decision Tree model becomes too complex,
fitting too closely to the noise and specific details of the training data,
resulting in poor generalization to new data.
• Signs of Overfitting: A decision tree that overfits often has high
accuracy on the training set but performs poorly on the test set.
• Causes of Overfitting:
• Allowing the tree to grow too deep, with many nodes and branches.
• Having very specific branches (splits) that capture the peculiarities of the
training data rather than general patterns.
• Consequences: Overfitting leads to a lack of model generalization, where
the model captures random fluctuations instead of meaningful trends.
Pre-Pruning (Early Stopping):
This approach stops the tree from growing once it meets certain conditions,
such as reaching a maximum depth, minimum number of samples per leaf, or
minimum information gain threshold. This prevents the tree from developing
complex branches that might lead to overfitting.

Post-Pruning (Cost Complexity Pruning):

This technique first grows the tree to its full depth, then prunes back branches
that provide minimal value. Post-pruning can be based on measures like cross-
validation to determine the optimal structure that minimizes errors on new
data.
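Both pruning styles can be tried with scikit-learn, which the slides do not mention; it is used here only as one possible illustration. The sketch assumes scikit-learn is installed, uses its bundled iris data set in place of the buys_computer table, and picks arbitrary threshold values. Note that scikit-learn's trees use binary splits in the CART style, not ID3's multiway splits.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Baseline: grow the tree fully, with no pruning.
full = DecisionTreeClassifier(criterion="entropy", random_state=0)
full.fit(X_train, y_train)

# Pre-pruning: stop early on depth, leaf size, or too small an impurity decrease.
pre = DecisionTreeClassifier(
    criterion="entropy",
    max_depth=3,
    min_samples_leaf=5,
    min_impurity_decrease=0.01,
    random_state=0,
)
pre.fit(X_train, y_train)

# Post-pruning: get candidate alphas from the full tree's pruning path, then
# choose the alpha with the best cross-validated accuracy on the training set.
path = full.cost_complexity_pruning_path(X_train, y_train)
alphas = sorted({max(float(a), 0.0) for a in path.ccp_alphas})  # clip rounding noise

def cv_accuracy(alpha):
    model = DecisionTreeClassifier(criterion="entropy", ccp_alpha=alpha, random_state=0)
    return cross_val_score(model, X_train, y_train, cv=5).mean()

best_alpha = max(alphas, key=cv_accuracy)
post = DecisionTreeClassifier(criterion="entropy", ccp_alpha=best_alpha, random_state=0)
post.fit(X_train, y_train)

for name, model in [("full", full), ("pre-pruned", pre), ("post-pruned", post)]:
    print(name, model.tree_.node_count, round(model.score(X_test, y_test), 3))
```

Selecting `ccp_alpha` by cross-validation on the training split keeps the test split untouched for the final comparison.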
