ML Unit 6

Ensemble learning is a machine learning technique that combines predictions from multiple models to improve accuracy and stability, leveraging the 'wisdom of the crowd' concept. It includes methods like bagging, boosting, and stacking, each with unique approaches to reduce bias and variance, enhance robustness, and handle model diversity. Ensemble methods are crucial in modern machine learning for their ability to improve predictive performance and mitigate risks in high-stakes applications.

UNIT-6

Ensemble Learning

Study Guide

Dr. Vinod Patidar

Associate Professor
CSE Department, PIT
Parul University
Introduction to Ensemble Learning
Ensemble learning is a machine learning technique where, instead of relying on a single model, you
combine the predictions from multiple models to arrive at a final, more accurate, and more stable
prediction.
The core principle is rooted in the idea of the "wisdom of the crowd." A single expert might be
wrong, but the collective answer from a large, diverse group of experts is likely to be closer to the
truth. In machine learning, these "experts" are individual models, called base learners. When a base
learner is deliberately simple it is called a "weak learner": a model that performs only slightly
better than random guessing (e.g., a simple decision tree with only one or two splits).
Combining hundreds or thousands of these weak learners can produce a powerful "strong learner."
This approach is effective because different models make different types of errors; as long as
those errors are not strongly correlated, they tend to cancel each other out when the outputs are
combined. This improves the model's ability to generalize, that is, to make accurate predictions on
new, unseen data, and makes the final model more robust against noise.
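To make the error-cancelling argument concrete, the sketch below computes how often a majority vote is right when every model is correct with the same probability. It assumes, as a simplification not stated in the text, that the models err independently of one another; the function name majority_vote_accuracy and the 0.6 accuracy figure are illustrative only.

```python
from math import comb

def majority_vote_accuracy(n_models, p_correct):
    """Probability that more than half of n_models voters are right.

    Assumes an odd n_models (no ties) and statistically independent
    errors, which real ensembles only approximate.
    """
    needed = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p_correct**k * (1 - p_correct) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )

# Accuracy of the vote as the committee grows, each member 60% accurate.
for n in (1, 11, 101):
    print(n, round(majority_vote_accuracy(n, 0.6), 3))
```

Under these assumptions the vote's accuracy rises with the number of models. When errors are correlated the gain is smaller, which is why diversity among the base learners matters.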
This technique is a versatile framework that can be applied to:
● Classification: Predicting a category (e.g., spam or not spam).
● Regression: Predicting a continuous value (e.g., house price).
● Anomaly Detection: Identifying unusual data points (e.g., fraudulent transactions).

Why Ensemble Learning is Important

Ensemble methods are a cornerstone of modern machine learning for several compelling reasons:
● Improved Predictive Performance: This is the primary benefit. Ensembles are specifically
designed to reduce the two main sources of error in machine learning:
o Bias: Error from overly simplistic assumptions (underfitting).
o Variance: Error from sensitivity to small fluctuations in the training data (overfitting).
● Reduction of Overfitting & Robustness: A single complex model can easily overfit by learning
the noise in the training data. Ensembles combat this by averaging out the predictions of many
models. This smooths out individual peculiarities, making the overall model more stable, reliable,
and less sensitive to outliers.
● Handling Model Diversity (Versatility): No single algorithm is best for every problem.
Ensembles allow you to combine fundamentally different types of models (e.g., a decision tree, a
support vector machine, and a neural network) in a process called stacking. This leverages the
unique strengths of each algorithm.
● Risk Mitigation: In high-stakes fields like medical diagnosis or financial trading, relying on a
single model is risky. An ensemble acts like a "committee" of models, making the final decision more
democratic, dependable, and less prone to failure from a single point of weakness.
● State-of-the-Art Performance: Ensemble methods are frequently used to win data science
competitions and build top-tier, state-of-the-art systems.
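The variance-reduction claim above can be illustrated with a standard identity for the average of predictions. The sketch assumes every model's prediction has the same variance sigma2 and every pair of models has the same correlation rho; these symbols and the function name are assumptions introduced here, not part of the original notes.

```python
def variance_of_average(sigma2, rho, n_models):
    """Variance of the mean of n_models predictions.

    Each prediction has variance sigma2 and every pair has correlation
    rho (a textbook simplification of a real ensemble).
    """
    return rho * sigma2 + (1 - rho) * sigma2 / n_models

# Averaging 100 models helps a lot when they are uncorrelated
# and not at all when they are perfectly correlated.
for rho in (0.0, 0.5, 1.0):
    print(rho, variance_of_average(1.0, rho, 100))
```

The first term does not shrink as models are added, so averaging only pays off to the extent that the models disagree with one another.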
Common Ensemble Learning Methods
There are three main families of ensemble techniques:
1. Bagging (Bootstrap Aggregating): Focuses on reducing variance. It involves training multiple
models in parallel on different bootstrap samples of the data (random samples drawn with
replacement) and then averaging their predictions or taking a majority vote.
The Random Forest is the most famous example.
2. Boosting: Focuses on reducing bias. It builds models sequentially, where each new model is
trained to correct the errors made by the previous ones. AdaBoost and Gradient Boosting (like
XGBoost) are the most well-known examples.
3. Stacking (Stacked Generalization): This advanced technique combines diverse models. It
trains several different base models and then uses a "meta-model" to learn how to best combine
their predictions.

1. Bagging (Bootstrap Aggregating)

Bagging is an ensemble method designed primarily to combat overfitting by reducing the variance
of a model.
How Bagging Works
1. Bootstrapping: From an original dataset of 'N' samples, you create many new datasets, also
of size 'N', by sampling with replacement. This means some data points may be chosen multiple
times, and some (about 36.8% of the original points when N is large, since (1 - 1/N)^N
approaches 1/e) may not be chosen at all. The data points left out of a given bootstrap sample
are called "Out-of-Bag" (OOB) samples for that model and can be used for validation.
2. Parallel Training: A base learning algorithm (e.g., a decision tree) is trained independently on
each of these bootstrap samples. Since each model sees slightly different data, each will learn
slightly different patterns.
3. Aggregation: Once all models are trained, their predictions are combined.
o Classification: A majority vote is taken.
o Regression: The average of all predictions is taken.
A classic example of bagging is the Random Forest algorithm, which combines the predictions from
multiple decision trees.
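The three steps above (bootstrapping, training, aggregation) can be sketched in plain Python. This is a minimal illustration under stated assumptions: the base learner fit_stump is a hypothetical one-threshold classifier on 1-D inputs, the toy data is made up, and the models are trained in a simple loop even though they are independent and could be trained in parallel.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Hypothetical weak learner: the one-threshold rule with fewest errors."""
    best = None  # (errors, threshold, left_label, right_label)
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging_fit(xs, ys, n_models, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    models, oob_fractions = [], []
    for _ in range(n_models):
        # Step 1: bootstrap sample of size n, drawn with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        oob_fractions.append(1 - len(set(idx)) / n)
        # Step 2: train one base learner per bootstrap sample.
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models, sum(oob_fractions) / n_models

def bagging_predict(models, x):
    # Step 3: aggregate by majority vote (use the mean for regression).
    return Counter(m(x) for m in models).most_common(1)[0][0]

xs = list(range(20))
ys = [0 if x < 10 else 1 for x in xs]
models, mean_oob = bagging_fit(xs, ys, n_models=25)
print(bagging_predict(models, 2), bagging_predict(models, 17), round(mean_oob, 2))
```

The value mean_oob is the average fraction of points left out of each bootstrap sample; for small N it sits somewhat below the large-N figure of 36.8%.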
Applications of Bagging
● IT: Improving network intrusion detection systems.
● Environment: Mapping wetland patterns in remote sensing.
● Finance: Fraud detection and credit risk assessment.
● Healthcare: Predicting medical conditions and for bioinformatics.

Advantages and Disadvantages of Bagging
Advantages:
● Variance Reduction: Highly effective at reducing variance and preventing overfitting.
● Easier Implementation: Libraries like Scikit-learn make it simple to implement.
● Robustness: Averaging makes the model more stable and robust to noise.
Disadvantages:
● Increased Model Complexity: The ensemble can be complex and difficult to interpret.
● Limited Improvement for Stable Models: Offers little benefit if the base model already has low
variance.
● Slower Training: Training multiple models can be computationally expensive.

2. Boosting
Boosting trains models sequentially, with each model in the sequence focusing on fixing the
mistakes of the one before it. The goal is to convert a collection of weak learners into a single, highly
accurate strong learner.
How Boosting Works
1. Initial Model: A simple weak learner is trained on the original dataset. Initially, all data points
are given equal importance or "weight."
2. Identify Errors and Re-weigh: The model's predictions are compared to the actual outcomes.
The data points that the model misclassified are given increased weights.
3. Train the Next Model: A second weak learner is trained, but this time the training process is
influenced by the updated weights, forcing it to focus on the difficult, misclassified examples.
4. Repeat: This process is repeated for a specified number of iterations, with each new model
focusing on the errors left over by the ensemble so far.
5. Final Aggregation: The final prediction is a weighted sum of the predictions from all the
models. Models that performed better are given more say in the final decision.
Examples of boosting algorithms include AdaBoost (Adaptive Boosting) and Gradient Boosting
(e.g., XGBoost, LightGBM).
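Steps 1 and 2 can be worked through numerically for a single round. The notes do not give a formula for the re-weighting, so the sketch below borrows AdaBoost's update rule as one concrete instance; the five-sample setup and the function name reweight are assumptions made for illustration.

```python
import math

def reweight(weights, misclassified):
    """One boosting round's weight update, using AdaBoost's formulas."""
    # Weighted error: total weight of the samples the model got wrong.
    err = sum(w for w, wrong in zip(weights, misclassified) if wrong)
    # The model's "say": larger when the weighted error is smaller.
    alpha = 0.5 * math.log((1 - err) / err)
    # Scale up the mistakes, scale down the rest, then renormalize.
    raw = [w * math.exp(alpha if wrong else -alpha)
           for w, wrong in zip(weights, misclassified)]
    total = sum(raw)
    return alpha, [r / total for r in raw]

weights = [0.2] * 5  # step 1: equal weights
misclassified = [False, False, True, False, False]
alpha, new_weights = reweight(weights, misclassified)
print(round(alpha, 3), [round(w, 3) for w in new_weights])
```

With one of five equally weighted samples misclassified, the error is 0.2, the model's say is ln(2), and the misclassified sample ends up carrying half of the total weight while the other four share the remaining half.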
Applications of Boosting
● Healthcare: Identifying patients at risk for diseases or predicting cancer survival rates.
● IT: Used by search engines for ranking results and in object detection (e.g., the Viola-Jones
face detector, which is built on AdaBoost).
● Finance: Deployed for automated fraud detection and pricing analysis.

Advantages and Disadvantages of Boosting
Advantages:
● Improved Accuracy: Combines weak models to create a final model with high accuracy,
effectively reducing bias.
● Better Handling of Imbalanced Data: It naturally focuses on the harder-to-classify (often
minority class) data points.
● Robustness to Overfitting (with care): While it can overfit if run for too many iterations, it's
often more robust than a single model.
Disadvantages:
● Complex: The process of re-weighting and sequential training can be algorithmically complex.
● Dependency: Each model is dependent on the previous one, which can cause errors to
propagate.
● Computationally Intensive: The sequential nature makes it difficult to parallelize, which can
slow down training.

3. Stacking (Stacked Generalization)

Stacking is an advanced ensemble technique that combines the predictions from multiple different
types of models.
How Stacking Works
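1. Train Base Models: Several different models (e.g., a Random Forest, a neural network, a
support vector machine) are trained on the training dataset.
2. Create Meta-Dataset: The predictions from these base models are then used as input features
to train a new, final model. To avoid leaking the training labels into the meta-model, these
predictions are usually generated out-of-fold, so that each base model predicts only on data it
was not trained on.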
3. Train Meta-Model: This final model, called the "meta-model" or "blender," learns how to best
combine the predictions from the base models to make the final prediction.
This approach leverages the unique strengths of each base algorithm, allowing the ensemble to
capture a much richer set of patterns.
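The three stacking steps can be sketched with two very simple base models and a one-parameter blender. Everything here is an illustrative assumption rather than the method of any particular library: the base models (a least-squares line and a 1-nearest-neighbour regressor), the meta-model (a single blending weight w), and the use of out-of-fold predictions to build the meta-dataset.

```python
def fit_line(xs, ys):
    """Base model A: ordinary least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)

def fit_1nn(xs, ys):
    """Base model B: 1-nearest-neighbour regressor."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def out_of_fold(fit, xs, ys, k):
    """Predict each point with a model that never saw that point."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds

def fit_blender(a, b, ys):
    """Meta-model: weight w minimising sum((y - (w*a + (1-w)*b))**2)."""
    den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    num = sum((ai - bi) * (y - bi) for ai, bi, y in zip(a, b, ys))
    return num / den if den else 0.5

xs = [float(x) for x in range(12)]
ys = [2 * x + 1 for x in xs]
# Step 2: meta-dataset of out-of-fold predictions; step 3: fit the blender.
w = fit_blender(out_of_fold(fit_line, xs, ys, 3),
                out_of_fold(fit_1nn, xs, ys, 3), ys)
# Refit the base models on all the data for final predictions.
line, nn = fit_line(xs, ys), fit_1nn(xs, ys)
stacked = lambda x: w * line(x) + (1 - w) * nn(x)
print(round(w, 3), round(stacked(20.0), 3))
```

Because the toy data is exactly linear, the blender learns to trust the line almost entirely. On real data the weight would reflect how well each base model predicts held-out points.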

Bagging vs. Boosting: A Comparison

● Training Process
o Bagging (e.g., Random Forest): Parallel. All models are trained independently and
simultaneously.
o Boosting (e.g., AdaBoost, XGBoost): Sequential. Each model is trained one after another, as it
depends on the previous one's errors.
● Primary Goal
o Bagging: Reduce Variance. Solves the overfitting problem by averaging predictions.
o Boosting: Reduce Bias. Solves the underfitting problem by focusing on errors.
● Model Weighting
o Bagging: Equal Weight. Every base model gets an equal vote or say.
o Boosting: Performance-Based Weight. Models are weighted based on their accuracy.
● Data Sampling
o Bagging: Bootstrap Sampling. Each model is trained on a random subset of data.
o Boosting: Weighted Sampling. The entire dataset is used, but the weights of samples are
adjusted at each step.
● When to Use
o Bagging: Best for complex, high-variance models (like deep decision trees) that are prone to
overfitting.
o Boosting: Best for simple, high-bias models (like shallow decision trees or "stumps").

Deep Dive into Specific Algorithms

Random Forest
Random Forest is the most popular implementation of the bagging technique. It's an ensemble of
decision trees with an extra twist to ensure the trees are diverse.
● How it Works: Like standard bagging, it builds many trees on bootstrap samples. However, it
also introduces randomness into the tree-building process itself. At each node, when deciding which
feature to split on, a regular decision tree considers all features. A Random Forest only considers a
random subset of the features.
● Why it Works: This dual randomness (in both samples and features) is crucial. It ensures the
trees in the forest are not highly correlated. By forcing each tree to be different, the Random Forest
maximizes the variance-reduction benefits of bagging.
● Key Hyperparameters:
o n_estimators: The number of trees in the forest.
o max_features: The size of the random subset of features to consider at each split.
o max_depth: The maximum depth of each tree.
o oob_score: When enabled, uses the "out-of-bag" samples as a built-in validation set to estimate
generalization performance, without a separate cross-validation run.
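The feature-level randomness described above can be sketched as a single split decision. This is a simplified illustration, not a full Random Forest: the helper names gini, best_split and random_split are hypothetical, the labels are assumed to be binary (0 or 1), and the toy rows are made up so that feature 0 is informative and feature 1 is mostly noise.

```python
import random

def gini(labels):
    """Gini impurity for binary 0/1 labels."""
    if not labels:
        return 0.0
    p1 = sum(labels) / len(labels)
    return 2 * p1 * (1 - p1)

def best_split(rows, labels, feature_ids):
    """Lowest weighted-Gini split among the given features only."""
    best = None  # (weighted_gini, feature, threshold)
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def random_split(rows, labels, max_features, rng):
    """Random Forest style: look at a random subset of the features."""
    candidates = rng.sample(range(len(rows[0])), max_features)
    return best_split(rows, labels, candidates)

rows = [(0, 5), (1, 3), (2, 9), (3, 1), (4, 7), (5, 2)]
labels = [0, 0, 0, 1, 1, 1]
rng = random.Random(0)
print(best_split(rows, labels, [0, 1]))   # a regular tree sees all features
print(random_split(rows, labels, 1, rng)) # a forest tree sees only a subset
```

When the subset happens to exclude the strongest feature, the tree is forced to split on something else, which is what keeps the trees in the forest from all looking alike.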
AdaBoost (Adaptive Boosting)
AdaBoost was the first practical, widely adopted boosting algorithm. Its name stands for Adaptive
Boosting because it adaptively adjusts the weights of the data samples at each iteration.
● The Algorithm:
1. Initialize Weights: Start by assigning an equal weight to every sample.
2. Iterate:
a. Train a weak learner (e.g., a decision stump) on the weighted data.
b. Calculate the model's weighted error rate (the sum of the weights of misclassified samples, with
the weights normalized to sum to 1).
c. Calculate the model's "say" or weight in the final ensemble (lower error = higher weight).
d. Update Sample Weights: Increase the weights of misclassified samples, decrease the weights
of correct ones, and renormalize so the weights again sum to 1.
3. Final Prediction: The final prediction is a weighted vote of all the weak learners.
● Key Vulnerability: AdaBoost is sensitive to noisy data and outliers. Because it aggressively
increases the weight of misclassified samples, an outlier can attract a huge amount of weight,
distorting subsequent models.
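The algorithm steps listed above can be sketched end to end on a small 1-D problem that no single stump can solve. This is a minimal illustration assuming labels of +1 and -1 and the usual AdaBoost formula for the model's say; the function names and the ten-point dataset are made up for the example.

```python
import math

def fit_weighted_stump(xs, ys, w):
    """Decision stump (labels are +1/-1) with the lowest weighted error."""
    best = None  # (weighted_error, threshold, left_label)
    for t in sorted(set(xs)):
        for left in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if (left if x <= t else -left) != y)
            if best is None or err < best[0]:
                best = (err, t, left)
    return best

def adaboost_fit(xs, ys, n_rounds):
    n = len(xs)
    w = [1.0 / n] * n                                  # 1. equal weights
    ensemble = []                                      # (alpha, threshold, left_label)
    for _ in range(n_rounds):
        err, t, left = fit_weighted_stump(xs, ys, w)   # 2a, 2b
        err = min(max(err, 1e-12), 1 - 1e-12)          # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)        # 2c. the model's say
        ensemble.append((alpha, t, left))
        # 2d. raise weights of mistakes, lower the rest, then renormalize.
        w = [wi * math.exp(-alpha * y * (left if x <= t else -left))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, x):
    # 3. weighted vote of all the weak learners.
    score = sum(a * (left if x <= t else -left) for a, t, left in ensemble)
    return 1 if score >= 0 else -1

xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
one_round = adaboost_fit(xs, ys, 1)
three_rounds = adaboost_fit(xs, ys, 3)
print(sum(adaboost_predict(one_round, x) == y for x, y in zip(xs, ys)),
      sum(adaboost_predict(three_rounds, x) == y for x, y in zip(xs, ys)))
```

On this data a single stump gets 7 of the 10 points right, while the weighted vote of three stumps classifies all 10 correctly. The same mechanism explains the vulnerability noted above: a mislabeled point would keep gaining weight round after round.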
XGBoost (Extreme Gradient Boosting)
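XGBoost is a modern, highly optimized implementation of the gradient boosting framework. At each
step, a new model is trained to predict the residual errors (the difference between the actual values
and the current ensemble's predictions) of the ensemble built so far. More generally, each new tree
fits the negative gradient of the loss function; for squared-error loss this is exactly the residual.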
● Key Features:
o Regularization: Includes L1 (Lasso) and L2 (Ridge) regularization terms to discourage
complexity and prevent overfitting.
o Hardware Optimization (Parallel Processing): While the boosting process is sequential, the
construction of each individual tree (in particular the search for the best split across features)
can be parallelized, which makes training fast.
o Built-in Handling of Missing Values: Can learn a default direction for missing values during
training, so you don't need to impute them first.
o Tree Pruning: Grows each tree up to the maximum depth and then prunes back splits whose gain
does not exceed a minimum threshold, further controlling complexity.
o Built-in Cross-Validation: Allows you to run cross-validation at each iteration to find the
optimal number of trees.
o Feature Importance: Can provide scores indicating the relative importance of each feature.
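The residual-fitting idea at the heart of gradient boosting can be sketched without any library. The code below is plain gradient boosting with squared-error loss and regression stumps, written only to illustrate the mechanism; it is not XGBoost and omits its regularization, missing-value handling and hardware optimizations. The function names, the learning rate of 0.5 and the toy data are assumptions.

```python
def fit_residual_stump(xs, residuals):
    """Regression stump: split point and the mean residual on each side."""
    best = None  # (squared_error, threshold, left_mean, right_mean)
    for t in sorted(set(xs))[:-1]:  # skip the max so the right side is non-empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lm) ** 2 for r in left)
               + sum((r - rm) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]

def gradient_boost(xs, ys, n_rounds, learning_rate=0.5):
    base = sum(ys) / len(ys)  # start from the mean prediction
    preds = [base] * len(ys)
    stumps, mse_history = [], []
    for _ in range(n_rounds):
        # Residuals of the ensemble so far are the next stump's targets.
        residuals = [y - p for y, p in zip(ys, preds)]
        mse_history.append(sum(r * r for r in residuals) / len(ys))
        t, lm, rm = fit_residual_stump(xs, residuals)
        stumps.append((t, lm, rm))
        # Add a damped copy of the new stump to the running prediction.
        preds = [p + learning_rate * (lm if x <= t else rm)
                 for x, p in zip(xs, preds)]
    return base, stumps, mse_history

xs = list(range(10))
ys = [x * x for x in xs]
base, stumps, mse_history = gradient_boost(xs, ys, n_rounds=20)
print(round(mse_history[0], 2), round(mse_history[-1], 4))
```

Here mse_history records the training error before each round, so it shows the error shrinking as stumps are added. Training error alone says nothing about overfitting, which is where regularization and cross-validation come in.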
