Week 1 Lecture Slide and Notes
[Figure: Training Data; Model-1, Model-2, Model-3, …, Model-n; Test Data; Combined Prediction]
This file is meant for personal use by [email protected] only.
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Sharing or publishing the contents in part or full is liable for legal action.
Ensemble Methods
[Figure: data with columns X and labels Y/N; Model]
Bagging
[Figure: bagging schematic. Labels: M1, M2, M3; Combine; Predict]
[Figure: Dataset A, B, C, D and samples of size n drawn from it, e.g. C, A, D, B; A, D, B, C; A, D, A, A; B, A, C, D; C, B, C, A. Panel labels: Small, Large]
[Figure: Data; Tree]
● Decision trees are very sensitive to even small changes in the data; such
models are usually called unstable.
● Then, for prediction, we could use the mean for regression trees and the mode
for classification trees.
● While individual trees tend to overfit the training data, averaging corrects
this.
○ Generate new training sets from the original data, each of the same size
(usually the size of the original data), by sampling with replacement.
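
The bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: all function names are hypothetical, and `fit_mean_model` is a trivial stand-in for a decision tree (it always predicts the mean target of its sample) so that the example needs only the standard library.

```python
import random
from statistics import mean, mode


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows by sampling with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_mean_model(sample):
    """Hypothetical stand-in learner: predicts the sample's mean target for any x."""
    avg = mean([y for _, y in sample])
    return lambda x: avg


def bagged_models(rows, fit, n_models, seed=0):
    """Fit one model per bootstrap sample of the training rows."""
    rng = random.Random(seed)
    return [fit(bootstrap_sample(rows, rng)) for _ in range(n_models)]


def predict_regression(models, x):
    """Combine regression models by averaging their predictions."""
    return mean([m(x) for m in models])


def predict_classification(models, x):
    """Combine classifiers by majority vote (the mode of the predicted labels)."""
    return mode([m(x) for m in models])


# Usage: rows are (feature, target) pairs; the values are made up for illustration.
rows = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]
models = bagged_models(rows, fit_mean_model, n_models=25)
print(predict_regression(models, x=2.5))
```

Swapping `fit_mean_model` for a real tree learner leaves the rest unchanged: each model still sees its own bootstrap sample, and only the combining step differs between regression and classification.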
Original data D: columns A, B, C, D, E, F, J; rows 1, 2, 3, 4, …, n
Bootstrap samples (row numbers drawn with replacement; columns A, B, C, D, E, F, J in each):
D1: 67, 43, 32, …
D2: 17, 14, 32, 47, …
Dk: 32, 95, 32, 64, …
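
The samples D1, D2, …, Dk above are lists of row numbers drawn with replacement, which is why a row such as 32 can appear more than once. The sketch below shows one way to draw such lists. The function name `bootstrap_indices` and the choice of 100 rows are assumptions made for illustration only.

```python
import random


def bootstrap_indices(n_rows, k_samples, seed=0):
    """Return k_samples lists of row numbers in 1..n_rows, each drawn with replacement."""
    rng = random.Random(seed)
    return [[rng.randint(1, n_rows) for _ in range(n_rows)] for _ in range(k_samples)]


samples = bootstrap_indices(n_rows=100, k_samples=3)
for name, idx in zip(["D1", "D2", "D3"], samples):
    # Fewer unique rows than draws means some rows were picked more than once.
    print(name, "first rows:", idx[:4], "unique rows:", len(set(idx)), "of", len(idx))
```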
A, B, C, D, E, F, J….
Say M = 10 => A B C D J
1
M=10 => High Tree Correlation
Good