DECISION TREES

Contents
• Decision Trees
• ID3 Algorithm
• C4.5 Algorithm
• Random Forest
Decision Trees
• A decision tree is a simple but powerful learning paradigm.
• It is a type of classification algorithm for supervised learning.
• A decision tree is a tree in which each branch node represents a
choice between a number of alternatives and each leaf node
represents a decision.
• A node with outgoing edges is called an internal or test node.
All other nodes are called leaves (also known as terminal or
decision nodes).
Decision Trees
Construction of a decision tree
1) First test all attributes and select the one that will function
as the best root.
2) Break up the training set into subsets based on the branches
of the root node.
3) Test the remaining attributes to see which ones fit best
underneath the branches of the root node.
4) Continue this process for each branch until:
   I. all the examples of a subset are of one type,
   II. there are no examples left, or
   III. there are no more attributes left.
Decision Tree Example
• Given a collection of shape and colour examples, learn a
decision tree that represents them. Use this representation to
classify new examples.
Decision Trees: The Representation
• Decision trees are classifiers for instances represented as
feature vectors. (color = ; shape = ; label = )
• Nodes are tests for feature values.
• There is one branch for each value of the feature.
• Leaves specify the categories (labels).
• They can categorize instances into multiple disjoint categories – multi-class.
(color = RED; shape = triangle)
Evaluation of a Decision Tree / Learning a Decision Tree
[Figure: a decision tree with root Color (branches blue, red, green); the blue and green branches test Shape (square, triangle, circle); the leaves carry the class labels A, B, C.]
Decision Trees for Binary Classification
• They can represent any Boolean function.
• A tree can be rewritten as a set of rules in Disjunctive Normal Form (DNF):
• green ∧ square → positive
• blue ∧ circle → positive
• blue ∧ square → positive
• The disjunction of these rules is equivalent to the decision tree.
[Figure: the corresponding decision tree – root Color (blue, red, green); the blue and green branches test Shape (square, triangle, circle); the leaves are labelled + or − according to the rules above.]
Limitations of Decision Trees
• Learning a tree that classifies the training data perfectly may
not lead to the tree with the best generalization performance.
- There may be noise in the training data that the tree is fitting.
- The algorithm might be making decisions based on very little data.
• A hypothesis h is said to overfit the training data if there is
another hypothesis h' such that h has smaller error than h' on the
training data but larger error than h' on the test data.
Evaluation Methods for Decision Trees
• Two basic approaches:
- Pre-pruning: stop growing the tree at some point during
construction when it is determined that there is not enough
data to make reliable choices.
- Post-pruning: grow the full tree and then remove nodes
that seem not to have sufficient evidence.
• Methods for evaluating subtrees to prune:
- Cross-validation: reserve a hold-out set to evaluate utility.
- Statistical testing: test whether the observed regularity can be
dismissed as likely to occur by chance.
- Minimum Description Length: is the additional complexity of
the hypothesis smaller than remembering the exceptions?
ID3 (Iterative Dichotomiser 3)
• Invented by J. Ross Quinlan in 1975.
• Used to generate a decision tree from a given data set by
employing a top-down, greedy search to test each attribute at
every node of the tree.
• The resulting tree is used to classify future samples.
Training Set (attributes: Gender, Car Ownership, Travel Cost, Income Level; class: Transportation)

Gender  | Car Ownership | Travel Cost | Income Level | Transportation
Male    | 0             | Cheap       | Low          | Bus
Male    | 1             | Cheap       | Medium       | Bus
Female  | 1             | Cheap       | Medium       | Train
Female  | 0             | Cheap       | Low          | Bus
Male    | 1             | Cheap       | Medium       | Bus
Male    | 0             | Standard    | Medium       | Train
Female  | 1             | Standard    | Medium       | Train
Female  | 1             | Expensive   | High         | Car
Male    | 2             | Expensive   | Medium       | Car
Female  | 2             | Expensive   | High         | Car
Algorithm
• Calculate the entropy of every attribute using the data set.
• Split the set into subsets using the attribute for which entropy is
minimum (or, equivalently, information gain is maximum).
• Make a decision tree node containing that attribute.
• Recurse on the subsets using the remaining attributes (see the sketch below).
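
These steps can be sketched in a few lines of Python. The snippet below is a minimal illustration only (the function and variable names, and the dictionary-based row format, are our own assumptions, not part of the original slides):

    import math
    from collections import Counter

    def entropy(labels):
        """Entropy of a list of class labels."""
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def id3(rows, attributes, target):
        """rows: list of dicts mapping attribute/target names to values."""
        labels = [r[target] for r in rows]
        # Stop when all examples are of one type or no attributes are left.
        if len(set(labels)) == 1 or not attributes:
            return Counter(labels).most_common(1)[0][0]
        # Pick the attribute whose split has minimum weighted entropy
        # (equivalently, maximum information gain).
        def split_entropy(attr):
            return sum(
                len(sub) / len(rows) * entropy([r[target] for r in sub])
                for v in set(r[attr] for r in rows)
                for sub in [[r for r in rows if r[attr] == v]]
            )
        best = min(attributes, key=split_entropy)
        tree = {best: {}}
        for v in set(r[best] for r in rows):
            subset = [r for r in rows if r[best] == v]
            remaining = [a for a in attributes if a != best]
            tree[best][v] = id3(subset, remaining, target)
        return tree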
Entropy
• In order to define information gain precisely, we first need to
discuss entropy.
• Entropy is a measure of the homogeneity of a sample.
• A completely homogeneous sample has entropy 0 (a pure leaf node).
• A sample split equally between two classes has entropy 1.
• The formula for entropy is:
Entropy(S) = − Σ p(I) log2 p(I)
• where p(I) is the proportion of S belonging to class I and the sum
runs over all classes.
• Example 1: If S is a collection of 14 examples with 9 YES and 5 NO
examples, then
Entropy(S) = −(9/14) log2(9/14) − (5/14) log2(5/14) = 0.940
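
As a quick check of this arithmetic (a small illustrative snippet, not part of the slides):

    import math

    def entropy(counts):
        """Entropy of a class distribution given as a list of counts."""
        total = sum(counts)
        return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

    print(entropy([9, 5]))   # ~0.940 (9 YES and 5 NO out of 14)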
Information Gain
• Information gain is the decrease in entropy after a dataset is
split on an attribute.
• The formula for information gain is:
Gain(S, A) = Entropy(S) − Σv (|Sv| / |S|) × Entropy(Sv)
where Sv is the subset of S for which attribute A has value v,
• |Sv| = number of elements in Sv,
• |S| = number of elements in S.
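
A small helper makes the formula concrete: it subtracts the weighted entropies of the subsets Sv from Entropy(S). The names below are illustrative, not from the slides:

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def information_gain(rows, attr, target):
        """Gain(S, A) = Entropy(S) - sum over v of |Sv|/|S| * Entropy(Sv)."""
        gain = entropy([r[target] for r in rows])
        for v in set(r[attr] for r in rows):
            sv = [r[target] for r in rows if r[attr] == v]
            gain -= len(sv) / len(rows) * entropy(sv)
        return gain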
Procedure
• First the entropy of the total dataset is calculated.
• The dataset is then split on each of the attributes in turn.
• The entropy of each branch is calculated.
• These entropies are added proportionally to get the total entropy of the split.
• The resulting entropy is subtracted from the entropy before the split.
• The result is the information gain, or decrease in entropy.
• The attribute that yields the largest information gain is chosen for the
decision node.
Our Problem:
Training Set (attributes: Gender, Car Ownership, Travel Cost, Income Level; class: Transportation)

Gender  | Car Ownership | Travel Cost | Income Level | Transportation
Male    | 0             | Cheap       | Low          | Bus
Male    | 1             | Cheap       | Medium       | Bus
Female  | 1             | Cheap       | Medium       | Train
Female  | 0             | Cheap       | Low          | Bus
Male    | 1             | Cheap       | Medium       | Bus
Male    | 0             | Standard    | Medium       | Train
Female  | 1             | Standard    | Medium       | Train
Female  | 1             | Expensive   | High         | Car
Male    | 2             | Expensive   | Medium       | Car
Female  | 2             | Expensive   | High         | Car
Calculate the entropy of the total dataset
• First compute the entropy of the given training set.

Probabilities:
Bus:   4/10 = 0.4
Train: 3/10 = 0.3
Car:   3/10 = 0.3

E(S) = − Σ P(I) log2 P(I)
E(S) = −(0.4) log2(0.4) − (0.3) log2(0.3) − (0.3) log2(0.3) = 1.571
Split the dataset on the ‘Gender’ attribute

Gender = Male (5 rows):   Bus 3/5 = 0.6, Train 1/5 = 0.2, Car 1/5 = 0.2
E(Male) = −0.6 log2(0.6) − 0.2 log2(0.2) − 0.2 log2(0.2) = 1.371

Gender = Female (5 rows): Bus 1/5 = 0.2, Train 2/5 = 0.4, Car 2/5 = 0.4
E(Female) = −0.2 log2(0.2) − 0.4 log2(0.4) − 0.4 log2(0.4) = 1.522

I(S, Gender) = 1.371 × (5/10) + 1.522 × (5/10) = 1.447
Gain(S, Gender) = E(S) − I(S, Gender) = 1.571 − 1.447 = 0.12
Split the dataset on the ‘Car Ownership’ attribute

Ownership = 0 (3 rows): Bus 2/3, Train 1/3, Car 0/3                    → Entropy = 0.918
Ownership = 1 (5 rows): Bus 2/5 = 0.4, Train 2/5 = 0.4, Car 1/5 = 0.2  → Entropy = 1.522
Ownership = 2 (2 rows): Bus 0, Train 0, Car 2/2 = 1                    → Entropy = 0

I(S, Ownership) = 0.918 × (3/10) + 1.522 × (5/10) + 0 × (2/10) = 1.036
Gain(S, Ownership) = 1.571 − 1.036 = 0.534
• If we choose Travel Cost as the splitting attribute:
- Entropy for Cheap = 0.722, Standard = 0, Expensive = 0
- IG = 1.21

• If we choose Income Level as the splitting attribute:
- Entropy for Low = 0, Medium = 1.459, High = 0
- IG = 0.695
Attribute       | Information Gain
Gender          | 0.125
Car Ownership   | 0.534
Travel Cost     | 1.21
Income Level    | 0.695
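
The four gains in this table can be reproduced with a short, self-contained script (written here as an independent check; the column indices stand in for the attribute names):

    import math
    from collections import Counter

    # (Gender, Car Ownership, Travel Cost, Income Level, Transportation)
    data = [
        ("Male",   0, "Cheap",     "Low",    "Bus"),
        ("Male",   1, "Cheap",     "Medium", "Bus"),
        ("Female", 1, "Cheap",     "Medium", "Train"),
        ("Female", 0, "Cheap",     "Low",    "Bus"),
        ("Male",   1, "Cheap",     "Medium", "Bus"),
        ("Male",   0, "Standard",  "Medium", "Train"),
        ("Female", 1, "Standard",  "Medium", "Train"),
        ("Female", 1, "Expensive", "High",   "Car"),
        ("Male",   2, "Expensive", "Medium", "Car"),
        ("Female", 2, "Expensive", "High",   "Car"),
    ]

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain(rows, col):
        g = entropy([r[-1] for r in rows])          # entropy before the split
        for v in set(r[col] for r in rows):
            sub = [r[-1] for r in rows if r[col] == v]
            g -= len(sub) / len(rows) * entropy(sub)
        return g

    for name, col in [("Gender", 0), ("Car Ownership", 1),
                      ("Travel Cost", 2), ("Income Level", 3)]:
        print(f"{name}: {gain(data, col):.3f}")
    # Prints roughly 0.125, 0.534, 1.210 and 0.695, matching the table.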
Diagram: Partial Decision Tree
[Travel Cost is chosen as the root: Cheap → ? (needs further splitting), Standard → Train, Expensive → Car.]
Iteration on the Subset of the Training Set (Travel Cost = Cheap)

Probabilities:
Bus:   4/5 = 0.8
Train: 1/5 = 0.2
Car:   0/5 = 0

E(S) = − Σ P(I) log2 P(I)
E(S) = −(0.8) log2(0.8) − (0.2) log2(0.2) = 0.722
• If we choose Gender as the splitting attribute:
- Entropy for Male = 0, Female = 1
- IG = 0.322
• If we choose Car Ownership as the splitting attribute:
- Entropy for 0 = 0, 1 = 0.918
- IG = 0.171
• If we choose Income Level as the splitting attribute:
- Entropy for Low = 0, Medium = 0.918
- IG = 0.171
Attribute       | Information Gain
Gender          | 0.322
Car Ownership   | 0.171
Income Level    | 0.171
Diagram: Final Decision Tree
[Root: Travel Cost. Standard → Train; Expensive → Car; Cheap → Gender.
Under Gender: Male → Bus; Female → Car Ownership.
Under Car Ownership: 0 → Bus; 1 → Train.]
Solution to Our Problem:

Name  | Gender | Car Ownership | Travel Cost | Income Level | Transportation
Abhi  | Male   | 1             | Standard    | High         | Train
Pavi  | Male   | 0             | Cheap       | Medium       | Bus
Ammu  | Female | 1             | Cheap       | High         | Train
Drawbacks of ID3
• Data may be over-fitted or over-classified if a small sample is
used for training.
• Only one attribute at a time is tested for making a decision.
• Classifying continuous data can be computationally expensive.
• Does not handle numeric attributes or missing values.
C4.5 Algorithm
• This algorithm was also invented by J. Ross Quinlan, for generating decision
trees.
• It can be treated as an extension of ID3.
• C4.5 is mainly used for classification of data.
• It is called a statistical classifier because of its ability to handle noisy
data.
Construction
• Determine the base cases.
• For each attribute a, compute the normalized information gain
from splitting on a.
• Let a_best be the attribute with the highest normalized
information gain.
• Create a decision node that splits on a_best.
• Recurse on the sub-lists obtained by splitting on a_best, and add
those nodes as children of the node.
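
The "normalized information gain" used by C4.5 is commonly the gain ratio: information gain divided by the split information (the entropy of the attribute's own value distribution). A minimal sketch, assuming rows are tuples with the class label in the last position:

    import math
    from collections import Counter

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def gain_ratio(rows, col):
        """Gain ratio = information gain / split information for attribute `col`."""
        values = [r[col] for r in rows]
        gain = entropy([r[-1] for r in rows])
        for v, count in Counter(values).items():
            sub = [r[-1] for r in rows if r[col] == v]
            gain -= count / len(rows) * entropy(sub)
        split_info = entropy(values)        # entropy of the attribute values themselves
        return gain / split_info if split_info > 0 else 0.0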
• The new features (compared to ID3) include:
(i) accepts both continuous and discrete attributes;
(ii) handles incomplete data points;
(iii) addresses the over-fitting problem with a bottom-up technique
usually known as "pruning";
(iv) different weights can be applied to the features that comprise
the training data.
Application based on features
• Continuous attribute categorisation:
Discretisation – partitions the continuous attribute values into a
discrete set of intervals (see the sketch below).
• Handling missing values:
Uses probability theory – probabilities are calculated from the
observed frequencies, and the most probable value is taken after
computing the frequencies.
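
For a continuous attribute, one common C4.5-style discretisation (sketched here under simplifying assumptions; the real implementation differs in detail) is to sort the observed values and consider a binary split at each midpoint between adjacent distinct values, keeping the threshold with the best gain:

    def candidate_thresholds(values):
        """Midpoints between adjacent distinct values of a numeric attribute."""
        vs = sorted(set(values))
        return [(a + b) / 2 for a, b in zip(vs, vs[1:])]

    # Example: candidate split points for a hypothetical 'age' attribute.
    print(candidate_thresholds([22, 25, 25, 31, 40]))   # [23.5, 28.0, 35.5]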
Random Forest
• Random forest is an ensemble classifier which uses many
decision trees.
• It builds multiple decision trees and merges them together to
get a more accurate and stable prediction.
• It can be used for both classification and regression problems.
Algorithm
1) Create a bootstrapped dataset.
2) Create a decision tree using the bootstrapped dataset, but
only use a random subset of the variables (or columns) at each
step.
3) Go back to step 1 and repeat.

Using a bootstrapped sample and considering only a subset of the
variables at each step results in a wide variety of trees (see the
example below).
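
In practice this procedure is usually not hand-coded. One way to apply it (assuming scikit-learn is available; the parameter values below are illustrative) is scikit-learn's RandomForestClassifier, which combines bootstrapping with per-split feature subsampling:

    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # 100 trees, each grown on a bootstrap sample and restricted to a random
    # subset of the features (sqrt of the feature count) at every split.
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt",
                                    bootstrap=True, random_state=0)
    forest.fit(X_train, y_train)
    print(forest.score(X_test, y_test))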
Explanation
• First, the process starts with randomisation of the data sets
(done by bagging).
• Bagging helps to reduce variance and aids effective classification.
• A random forest contains many classification trees.
• Bagging = bootstrapping the data plus using the aggregate to
make a decision.
• Out-of-bag dataset – the dataset formed by the tuples from the
training set that were not chosen during bootstrapping (illustrated below).
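
A bootstrap sample and its out-of-bag set can be illustrated with plain index sampling (purely illustrative, not tied to any library):

    import random

    n = 10                                                # size of the training set
    random.seed(42)
    bootstrap = [random.randrange(n) for _ in range(n)]   # sample rows with replacement
    out_of_bag = sorted(set(range(n)) - set(bootstrap))   # rows never drawn

    print("bootstrap sample:", bootstrap)
    print("out-of-bag rows:", out_of_bag)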
Gini Index
• Random forest uses the Gini index, taken from the CART learning
system, to construct decision trees. The Gini index of node
impurity is the measure most commonly chosen for classification-
type problems.
• If a dataset T contains examples from n classes, the Gini index
Gini(T) is defined as:

Gini(T) = 1 − Σ (j = 1..n) pj²

where pj is the relative frequency of class j in T.

• If a dataset T is split into two subsets T1 and T2 with sizes N1
and N2 respectively, and the split data contains examples from the
n classes, the Gini index of the split is defined as:

Gini_split(T) = (N1/N) Gini(T1) + (N2/N) Gini(T2)
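
Both formulas translate directly into code (a short illustration; the helper names are ours):

    from collections import Counter

    def gini(labels):
        """Gini(T) = 1 - sum of pj squared over the classes in T."""
        n = len(labels)
        return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

    def gini_split(left, right):
        """Weighted Gini index of a binary split T -> (T1, T2)."""
        n = len(left) + len(right)
        return len(left) / n * gini(left) + len(right) / n * gini(right)

    # The 4 Bus / 3 Train / 3 Car distribution from the earlier training set:
    print(round(gini(["Bus"] * 4 + ["Train"] * 3 + ["Car"] * 3), 3))   # 0.66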


Flow Chart of Algorithm
Advantages
• It produces a highly accurate classifier and learning is fast.
• It runs efficiently on large databases.
• It can handle thousands of input variables without variable
deletion.
• It computes proximities between pairs of cases that can be
used in clustering, locating outliers or (by scaling) giving
interesting views of the data.
• It offers an experimental method for detecting variable
interactions.
