17 Ensemble Learning
- Dr. Sifat Momen (SfM1)
Learning goals
• After this presentation, you should be able to:
• Appreciate the use of ensemble techniques
• Understand the general idea of the ensemble approach
• Understand the idea of bagging and pasting
• Understand OOB evaluation
• Understand and apply the Random Forest algorithm
• Understand the boosting approach
• Understand the AdaBoost algorithm
Jupyter Notebook
• Please note that there is a Jupyter notebook associated with this presentation
• Please use both in parallel for optimal understanding
Wisdom of the crowd
• Aggregating the answers of a large, diverse group of people is often better than relying on a single expert; ensemble learning applies the same idea to predictors
Ensemble Methods
• Construct a set of base classifiers learned from the training data
• Predict class label of test records by combining the predictions made
by multiple classifiers (e.g., by taking majority vote)
Necessary Conditions for Ensemble Methods
• Ensemble Methods work better than a single base classifier if:
1. All base classifiers are independent of each other
2. All base classifiers perform better than random guessing
(error rate < 0.5 for binary classification)
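• To see why these conditions matter, here is a minimal sketch (the figures of 25 base classifiers and a 0.35 error rate are illustrative assumptions, not from the slides): if the base classifiers are truly independent, the majority vote is wrong only when more than half of them are wrong, and that probability is far smaller than the individual error rate.

from math import comb

n_clf, eps = 25, 0.35   # assumed: 25 independent base classifiers, each with error rate 0.35
ensemble_error = sum(
    comb(n_clf, k) * eps**k * (1 - eps)**(n_clf - k)
    for k in range(n_clf // 2 + 1, n_clf + 1)   # majority (13 or more) of the classifiers wrong
)
print(ensemble_error)   # ~0.06, well below the base error rate of 0.35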
General Approach of Ensemble Learning
• The predictions of the base classifiers are combined using a majority vote or a weighted majority vote (weighted according to their accuracy or relevance)
Constructing Ensemble Classifiers
• By manipulating training set
• Example: bagging, boosting, random forests
• By manipulating input features
• Example: random forests
• By manipulating class labels
• Example: error-correcting output coding
• By manipulating learning algorithm
• Example: injecting randomness in the initial weights of ANN
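• As a small illustration of the last item (a sketch only; the network size, dataset, and number of seeds are arbitrary assumptions), the same algorithm can be trained several times on the same data with different random initial weights, and the resulting networks combined by majority vote:

import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Same data, same algorithm: only the random weight initialisation differs per network
nets = [MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=seed)
        .fit(X_train, y_train) for seed in range(5)]

votes = np.array([net.predict(X_test) for net in nets])   # shape (5, n_test), labels 0/1
majority = (votes.mean(axis=0) > 0.5).astype(int)         # majority vote over the 5 networks
print((majority == y_test).mean())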
Voting Classifiers – Training Diverse Classifiers
• Aggregating the predictions of several diverse classifiers and predicting the class that receives the most votes is called hard voting.
• Somewhat surprisingly, this voting classifier often achieves a higher accuracy than the best classifier in the ensemble.
• In fact, even if each classifier is a weak learner (meaning it does only slightly better than random guessing),
the ensemble can still be a strong learner (achieving high accuracy), provided there are a sufficient number
of weak learners in the ensemble and they are sufficiently diverse.
• If all classifiers are able to estimate class probabilities (i.e., if they all have a predict_proba() method), then
you can tell Scikit-Learn to predict the class with the highest class probability, averaged over all the
individual classifiers. This is called soft voting.
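• A minimal Scikit-Learn sketch of both voting modes (the choice of base classifiers and dataset here is illustrative):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),  # probability=True enables predict_proba
    ],
    voting="soft",   # average class probabilities; use voting="hard" for a plain majority vote
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))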
Voting Classifiers – Bagging and Pasting
• One way to get a diverse set of classifiers is to use very different
training algorithms, as just discussed.
• Another approach is to use the same training algorithm for every
predictor but train them on different random subsets of the training
set.
• When sampling is performed with replacement, this method is called bagging (short for bootstrap aggregating).
• When sampling is performed without replacement, it is called pasting.
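• A minimal BaggingClassifier sketch (dataset and hyperparameter values are illustrative); switching bootstrap from True to False switches from bagging to pasting:

from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 500 decision trees, each trained on 100 instances drawn from the training set
# bootstrap=True  -> sampling with replacement (bagging)
# bootstrap=False -> sampling without replacement (pasting)
bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    max_samples=100, bootstrap=True, n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)
print(bag_clf.score(X_test, y_test))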
Voting Classifiers – Bagging and Pasting
[Figures: the same training algorithm is trained on different random subsets of the training set; one diagram illustrates sampling with replacement (bagging), the other sampling without replacement (pasting)]
Out of Bag Evaluation
• With bagging, some training instances may be sampled several times
for any given predictor, while others may not be sampled at all.
• It can be shown mathematically that only about 63% of the training instances are sampled, on average, for each predictor.
• The probability that a given training instance is selected at least once in a bootstrap sample is:
  1 - (1 - 1/n)^n   (n: number of training instances)
  which approaches 1 - 1/e ≈ 0.632 as n grows large
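• A quick numerical check of this figure (a sketch; the choice of n = 10,000 is arbitrary):

import numpy as np

n = 10_000                                  # number of training instances
rng = np.random.default_rng(42)
sample = rng.integers(0, n, size=n)         # one bootstrap sample: n draws with replacement
print(len(np.unique(sample)) / n)           # fraction of distinct instances drawn, ~0.632
print(1 - (1 - 1/n) ** n)                   # closed-form probability, also ~0.632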
Out of Bag Evaluation
[Figure: the probability 1 - (1 - 1/n)^n that a given training instance is selected, plotted against the number of training instances n; the curve quickly levels off near 0.632]
• The remaining ~37% of the training instances that are not sampled are called out-of-bag (OOB) instances. Note that they are not the same 37% for all predictors.
Out of Bag Evaluation
• A bagging ensemble can be evaluated using OOB instances, without
the need for a separate validation set: indeed, if there are enough
estimators, then each instance in the training set will likely be an OOB
instance of several estimators, so these estimators can be used to
make a fair ensemble prediction for that instance.
• Once you have a prediction for each instance, you can compute the
ensemble’s prediction accuracy (or any other metric).
• In Scikit-Learn, you can set oob_score=True when creating a
BaggingClassifier to request an automatic OOB evaluation after
training.
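• A minimal sketch of OOB evaluation in Scikit-Learn (dataset and hyperparameters are illustrative):

from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

bag_clf = BaggingClassifier(
    DecisionTreeClassifier(), n_estimators=500,
    bootstrap=True, oob_score=True,        # score each instance with the estimators that never saw it
    n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)
print(bag_clf.oob_score_)                  # OOB accuracy estimate, no separate validation set needed
print(bag_clf.score(X_test, y_test))       # typically close to the OOB estimate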
Bagging Example
• Consider a 1-dimensional data set:
Original Data:
x 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
y 1 1 1 -1 -1 -1 -1 1 1 1
• The base classifier is a decision stump (a decision tree with a single split)
• Decision rule: x <= k versus x > k, where the split point k is chosen based on entropy
• [Stump diagram: test x <= k; the True branch predicts y_left, the False branch predicts y_right]
Bagging Example
Bagging Round 1:
x: 0.1 0.2 0.2 0.3 0.4 0.4 0.5 0.6 0.9 0.9
y:   1   1   1   1  -1  -1  -1  -1   1   1
Stump: x <= 0.35 → y = 1; x > 0.35 → y = -1

Bagging Round 2:
x: 0.1 0.2 0.3 0.4 0.5 0.5 0.9   1   1   1
y:   1   1   1  -1  -1  -1   1   1   1   1
Stump: x <= 0.7 → y = 1; x > 0.7 → y = 1

Bagging Round 3:
x: 0.1 0.2 0.3 0.4 0.4 0.5 0.7 0.7 0.8 0.9
y:   1   1   1  -1  -1  -1  -1  -1   1   1
Stump: x <= 0.35 → y = 1; x > 0.35 → y = -1

Bagging Round 4:
x: 0.1 0.1 0.2 0.4 0.4 0.5 0.5 0.7 0.8 0.9
y:   1   1   1  -1  -1  -1  -1  -1   1   1
Stump: x <= 0.3 → y = 1; x > 0.3 → y = -1

Bagging Round 5:
x: 0.1 0.1 0.2 0.5 0.6 0.6 0.6   1   1   1
y:   1   1   1  -1  -1  -1  -1   1   1   1
Stump: x <= 0.35 → y = 1; x > 0.35 → y = -1
Bagging Example
Bagging Round 6:
x: 0.2 0.4 0.5 0.6 0.7 0.7 0.7 0.8 0.9   1
y:   1  -1  -1  -1  -1  -1  -1   1   1   1
Stump: x <= 0.75 → y = -1; x > 0.75 → y = 1

Bagging Round 7:
x: 0.1 0.4 0.4 0.6 0.7 0.8 0.9 0.9 0.9   1
y:   1  -1  -1  -1  -1   1   1   1   1   1
Stump: x <= 0.75 → y = -1; x > 0.75 → y = 1

Bagging Round 8:
x: 0.1 0.2 0.5 0.5 0.5 0.7 0.7 0.8 0.9   1
y:   1   1  -1  -1  -1  -1  -1   1   1   1
Stump: x <= 0.75 → y = -1; x > 0.75 → y = 1

Bagging Round 9:
x: 0.1 0.3 0.4 0.4 0.6 0.7 0.7 0.8   1   1
y:   1   1  -1  -1  -1  -1  -1   1   1   1
Stump: x <= 0.75 → y = -1; x > 0.75 → y = 1

Bagging Round 10:
x: 0.1 0.1 0.1 0.1 0.3 0.3 0.8 0.8 0.9 0.9
y:   1   1   1   1   1   1   1   1   1   1
Stump: x <= 0.05 → y = 1; x > 0.05 → y = 1
Bagging Example
• Summary of Trained Decision Stumps:
Round Split Point Left Class Right Class
1 0.35 1 -1
2 0.7 1 1
3 0.35 1 -1
4 0.3 1 -1
5 0.35 1 -1
6 0.75 -1 1
7 0.75 -1 1
8 0.75 -1 1
9 0.75 -1 1
10 0.05 1 1
Bagging Example
• Use majority vote (sign of sum of predictions) to determine class of
ensemble classifier
Round x=0.1 x=0.2 x=0.3 x=0.4 x=0.5 x=0.6 x=0.7 x=0.8 x=0.9 x=1.0
1 1 1 1 -1 -1 -1 -1 -1 -1 -1
2 1 1 1 1 1 1 1 1 1 1
3 1 1 1 -1 -1 -1 -1 -1 -1 -1
4 1 1 1 -1 -1 -1 -1 -1 -1 -1
5 1 1 1 -1 -1 -1 -1 -1 -1 -1
6 -1 -1 -1 -1 -1 -1 -1 1 1 1
7 -1 -1 -1 -1 -1 -1 -1 1 1 1
8 -1 -1 -1 -1 -1 -1 -1 1 1 1
9 -1 -1 -1 -1 -1 -1 -1 1 1 1
10 1 1 1 1 1 1 1 1 1 1
Sum   2  2  2 -6 -6 -6 -6  2  2  2
Sign  1  1  1 -1 -1 -1 -1  1  1  1
• Bagging can also increase the complexity (representation capacity) of
simple classifiers such as decision stumps
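• A minimal sketch that reproduces the vote table above from the ten trained stumps (split points and leaf classes taken from the summary of trained decision stumps):

import numpy as np

# (split point k, class for x <= k, class for x > k), rounds 1 to 10
stumps = [(0.35, 1, -1), (0.70, 1, 1), (0.35, 1, -1), (0.30, 1, -1), (0.35, 1, -1),
          (0.75, -1, 1), (0.75, -1, 1), (0.75, -1, 1), (0.75, -1, 1), (0.05, 1, 1)]

x = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
preds = np.array([[left if xi <= k else right for xi in x] for k, left, right in stumps])
total = preds.sum(axis=0)      # sum of the 10 stump predictions at each x
print(total)                   # [ 2  2  2 -6 -6 -6 -6  2  2  2]
print(np.sign(total))          # ensemble class: [ 1  1  1 -1 -1 -1 -1  1  1  1]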
Random Forest Algorithm
• Construct an ensemble of decision trees by manipulating the training set as well as the input features
• Use a bootstrap sample to train every decision tree (similar to bagging)
• Use the following tree induction algorithm:
  • At every internal node of the decision tree, randomly sample p attributes from which the split criterion is selected
  • Repeat this procedure until all leaves are pure (the trees are left unpruned)
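• A minimal Scikit-Learn sketch of a random forest (dataset and hyperparameters are illustrative):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 500 trees, each grown on a bootstrap sample; at every split only a random
# subset of the features (here sqrt of the total) is considered
rnd_clf = RandomForestClassifier(
    n_estimators=500, max_features="sqrt", n_jobs=-1, random_state=42)
rnd_clf.fit(X_train, y_train)
print(rnd_clf.score(X_test, y_test))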
Boosting (originally called hypothesis boosting)
• Refers to an ensemble method that can combine several weak
learners into a strong learner.
• The general idea of boosting is to train predictors sequentially, each
trying to correct its predecessor.
• AdaBoost (adaptive boosting)
• Gradient Boosting
• XGBoost
• LightGBM (Light gradient boosting machine)
AdaBoost
• One way for a new predictor to correct its predecessor is to pay a bit
more attention to the training instances that the predecessor
underfit.
• This results in new predictors focusing more and more on the hard
cases.
• This is the technique used by AdaBoost
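• A minimal Scikit-Learn sketch of AdaBoost with decision stumps (dataset and hyperparameters are illustrative):

from sklearn.datasets import make_moons
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 200 decision stumps trained sequentially, each paying a bit more attention
# to the training instances its predecessors got wrong
ada_clf = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=1),
    n_estimators=200, learning_rate=0.5, random_state=42)
ada_clf.fit(X_train, y_train)
print(ada_clf.score(X_test, y_test))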
AdaBoost in detail (slightly different from the version in the textbook)
Voting Power
[Figure: a predictor's voting power (weight) plotted against its error rate, ranging from about +2.5 down to about -2.5; the weight is large and positive when the error is near 0, zero when the error is 0.5, and negative when the error exceeds 0.5]
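• As a hedged sketch of the relationship in the figure: the classic AdaBoost weight α = ½ ln((1 − ε)/ε) produces a curve of exactly this shape (the slides note their version differs slightly from the textbook, so the exact scaling here is an assumption):

import numpy as np

errors = np.array([0.05, 0.25, 0.50, 0.75, 0.95])
alpha = 0.5 * np.log((1 - errors) / errors)   # classic AdaBoost voting power
print(np.round(alpha, 2))                     # [ 1.47  0.55  0.   -0.55 -1.47]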
How is the overall classifier assembled?
• The overall classifier is assembled in a series of rounds
• For each round:
  • Pick the best "weak" classifier, h(x), to add to the overall classifier, H(x)
    • Best: the classifier that makes the fewest (weighted) errors
  • Assign the classifier a voting power α
  • Append the term αh(x) to the overall classifier
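• A minimal from-scratch sketch of this assembly loop (assuming decision stumps as weak learners, labels in {-1, +1}, and the classic AdaBoost weight and re-weighting rules, which may differ in detail from the version in the Excel file):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_fit(X, y, n_rounds=10):
    # Assemble H(x) = sign(sum_t alpha_t * h_t(x)) one round at a time; y must be -1/+1
    n = len(y)
    w = np.full(n, 1.0 / n)                 # instance weights, initially uniform
    stumps, alphas = [], []
    for _ in range(n_rounds):
        # pick the best weak classifier under the current instance weights
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()            # weighted error rate
        if err >= 0.5:                      # no better than random guessing: stop
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))   # voting power of this round
        stumps.append(stump)
        alphas.append(alpha)
        w *= np.exp(-alpha * y * pred)      # boost the weights of misclassified instances
        w /= w.sum()                        # renormalise
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(scores)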
AdaBoost classifier
• Check the corresponding Excel file to see how the AdaBoost classifier works