Department of CSE
COURSE NAME: DATA WAREHOUSE AND MINING
COURSE CODE: 22DSB3202
Topic: Classification by Decision Tree Induction
Session – 14
AIM OF THE SESSION
To develop an ability to understand decision trees and the different types of classification.
INSTRUCTIONAL OBJECTIVES
This Session is designed to:
1. Demonstrate basic learning methods
2. Describe best attribute selection
3. Explain the procedure for computing entropy and information gain
4. Describe the concepts of overfitting and tree pruning
LEARNING OUTCOMES
At the end of this session, you should be able to:
1. Define the different methods of classification
2. Describe the algorithm for decision tree induction
3. Summarize the concept of decision tree induction
SESSION INTRODUCTION
Rule Induction:
Rule induction is a data mining process of
deducing if-then rules from a data set. These
symbolic decision rules explain an inherent
relationship between the attributes and class
labels in the data set. Many real-life
experiences are based on intuitive rule
induction. For example, we can proclaim a rule
that states “if it is 8 a.m. on a weekday, then
highway traffic will be heavy” and “if it is 8 p.m.
on a Sunday, then the traffic will be light."
SESSION INTRODUCTION
INTRODUCTION OF DECISION TREE INDUCTION
A decision tree is a supervised learning method used in data mining for classification and regression tasks. It is a tree structure that supports decision making. The decision tree builds classification or regression models in the form of a tree: it separates a data set into smaller and smaller subsets while the tree is incrementally developed. The final tree consists of decision nodes and leaf nodes. A decision node has at least two branches, while a leaf node represents a classification or decision; no further splitting is performed on leaf nodes. The topmost decision node in a tree, which corresponds to the best predictor, is called the root node. Decision trees can handle both categorical and numerical data.
SESSION DESCRIPTION
Decision trees are among the most popular and widely used tools for classification and prediction. A decision tree is a
flowchart-like tree structure, where
each internal node denotes a test on
an attribute, each branch represents
an outcome of the test, and each leaf
node (terminal node) holds a class
label.
SESSION DESCRIPTION
Short note on Decision Tree:-
•A decision tree, also known as a prediction tree, uses a tree structure to represent sequences of decisions and their consequences.
•Considering the input X = (X1, X2, …, Xn), the aim is to predict a response or output variable Y.
•Each element in the set (X1, X2, …, Xn) is known as an input variable. The prediction is achieved by building a decision tree consisting of test points and branches.
•At each test point, a particular branch is selected and the tree is traversed downwards.
•Ultimately, a final point is reached, and the prediction can be made.
•In a decision tree, the test points involve testing specific input variables (or attributes), and the branches represent the decisions being made.
•Because of their flexibility and easy visualization, decision trees are commonly deployed in data mining applications for classification.
•In a decision tree, the input values can be categorical or continuous.
•A decision tree establishes a structure of test points (known as nodes) and branches that represents the decisions being made.
•A leaf node is a node with no further branches. Leaf nodes return class labels or, in some cases, probability scores.
•It is possible to convert a decision tree into a set of decision rules.
•There are two types of decision trees: classification trees and regression trees.
SESSION DESCRIPTION
•Classification trees are generally applied to output variables that are categorical and often binary in nature, for example yes or no, sale or no sale, and so on.
•Regression trees, in contrast, are applied to output variables that are numeric or continuous, for example the predicted price of a consumer good.
•Decision trees can be applied in a wide variety of situations. They are easy to represent visually, and the corresponding decision rules are straightforward.
•Also, because the result is a sequence of logical if-then statements, no underlying assumption is made about a linear or nonlinear relationship between the input variables and the response variable.
SESSION DESCRIPTION
An Illustrative Example (1/2)
Day Outlook Temperature Humidity Wind Play Tennis
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
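To make the example concrete, the following minimal Python sketch (not part of the original slides, and assuming pandas and scikit-learn are available) loads the Play Tennis table above and fits a decision tree; the one-hot encoding step is only a workaround because scikit-learn trees expect numeric inputs.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Outlook":     ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                    "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                    "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity":    ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                    "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Wind":        ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
                    "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "PlayTennis":  ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                    "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})

# One-hot encode the categorical attributes, since scikit-learn trees need numeric features.
X = pd.get_dummies(data.drop(columns="PlayTennis"))
y = data["PlayTennis"]

tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))

The printed splits can be compared with the tree built by hand later in this session.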
SESSION DESCRIPTION (Cont..)
Overfitting in Decision Trees
Consider adding the noisy training example
<Sunny, Hot, Normal, Strong, PlayTennis = No>
What effect would it have on the tree learned earlier?
SESSION DESCRIPTION (Cont..)
The problem of overfitting can be reduced by selecting the best attribute for each split. For that we need the concepts of entropy and information gain.
• Information gain is the expected reduction in entropy caused by partitioning the examples on an attribute.
• The higher the information gain, the more effective the attribute is in classifying the training data.
• Expected reduction in entropy knowing A:
Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) · Entropy(Sv)
where Values(A) is the set of possible values of A, and Sv is the subset of S for which A has value v.
SESSION DESCRIPTION (Cont..)
Concept of Entropy
If a point represents a gas molecule, which system has more entropy, and how do we measure it?
A more ordered, more organized system (less probable) has lower entropy; a less ordered, disorganized system (more probable) has higher entropy.
The following figure shows three possibilities for partitioning tuples based on the splitting criterion, each with examples:
• A is discrete-valued
• A is continuous-valued
• A is discrete-valued and a binary tree must be produced
ENTROPY AND INFORMATION THEORY
• Entropy specifies the average length (in bits) of the message needed to transmit the outcome of a random variable; this depends on the probability distribution.
• An optimal-length code assigns −log2 p bits to a message of probability p, so the most probable messages get the shortest codes.
• Example: 8-sided [unbalanced] die
Outcome:      1     2     3     4     5     6     7     8
Probability:  4/16  4/16  2/16  2/16  1/16  1/16  1/16  1/16
Code length:  2 bits 2 bits 3 bits 3 bits 4 bits 4 bits 4 bits 4 bits
E = (1/4 · log2 4) · 2 + (1/8 · log2 8) · 2 + (1/16 · log2 16) · 4 = 1 + 3/4 + 1 = 2.75 bits
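As a quick check of the die example, the short Python sketch below (not in the original slides) computes the expected code length as the sum of p · log2(1/p) over the eight outcomes.

from math import log2

probs = [4/16, 4/16, 2/16, 2/16, 1/16, 1/16, 1/16, 1/16]
expected_bits = sum(p * log2(1 / p) for p in probs)   # entropy = expected optimal code length
print(expected_bits)                                  # 2.75 bits, matching E above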
INFORMATION GAIN AS ENTROPY REDUCTION
• Information gain is the expected reduction in entropy caused by partitioning the examples on an attribute.
• The higher the information gain, the more effective the attribute is in classifying the training data.
• Expected reduction in entropy knowing A:
Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|Sv| / |S|) · Entropy(Sv)
where Values(A) is the set of possible values of A, and Sv is the subset of S for which A has value v.
EXAMPLE: EXPECTED INFORMATION GAIN
• Let
• Values(Wind) = {Weak, Strong}
• S = [9+, 5−]
• SWeak = [6+, 2−]
• SStrong = [3+, 3−]
• Information gain due to knowing Wind:
Gain(S, Wind) = Entropy(S) − 8/14 · Entropy(SWeak) − 6/14 · Entropy(SStrong)
= 0.94 − 8/14 · 0.811 − 6/14 · 1.00
= 0.048
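These numbers can be verified with a short Python sketch (not in the original slides) that computes entropy from positive/negative counts:

from math import log2

def entropy(pos, neg):
    # Entropy of a two-class sample given its positive/negative counts.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                      # treat 0 * log2(0) as 0
            p = count / total
            result -= p * log2(p)
    return result

gain_wind = entropy(9, 5) - 8/14 * entropy(6, 2) - 6/14 * entropy(3, 3)
print(round(gain_wind, 3))             # 0.048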
WHICH ATTRIBUTE IS THE BEST CLASSIFIER?
EXAMPLE
FIRST STEP: WHICH ATTRIBUTE TO TEST AT THE ROOT?
• Which attribute should be tested at the root?
• Gain(S, Outlook) = 0.246
• Gain(S, Humidity) = 0.151
• Gain(S, Wind) = 0.048
• Gain(S, Temperature) = 0.029
• Outlook provides the best prediction for the target
• Let's grow the tree:
• add to the tree a successor for each possible value of Outlook
• partition the training samples according to the value of Outlook
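The four root-node gains listed above can be recomputed directly from the Play Tennis table; the pure-Python sketch below (not part of the original slides) does exactly that.

from math import log2
from collections import Counter

# Rows follow the Play Tennis table: (Outlook, Temperature, Humidity, Wind, PlayTennis).
rows = [
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]
attributes = ["Outlook", "Temperature", "Humidity", "Wind"]

def entropy(labels):
    counts = Counter(labels)
    return -sum(c / len(labels) * log2(c / len(labels)) for c in counts.values())

def gain(rows, col):
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in set(r[col] for r in rows):
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

for i, name in enumerate(attributes):
    print(name, round(gain(rows, i), 3))
# Outlook 0.247, Temperature 0.029, Humidity 0.152, Wind 0.048 (small rounding differences from the slide values)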
AFTER FIRST STEP
SECOND STEP
Working on Outlook=Sunny node:
Gain(SSunny, Humidity) = 0.970 − 3/5 · 0.0 − 2/5 · 0.0 = 0.970
Gain(SSunny, Wind) = 0.970 − 2/5 · 1.0 − 3/5 · 0.918 = 0.019
Gain(SSunny, Temp.) = 0.970 − 2/5 · 0.0 − 2/5 · 1.0 − 1/5 · 0.0 = 0.570
Humidity provides the best prediction for the target
Let's grow the tree:
add to the tree a successor for each possible value of Humidity
partition the training samples according to the value of Humidity
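These second-step values can be checked with a quick Python sketch (not in the original slides), using the class counts of the five Sunny days read off the Play Tennis table:

from math import log2

def entropy(pos, neg):
    # Entropy of a two-class sample from its positive/negative counts (0 * log2(0) taken as 0).
    return -sum(c / (pos + neg) * log2(c / (pos + neg)) for c in (pos, neg) if c)

e_sunny = entropy(2, 3)                                                # ≈ 0.970
gain_humidity = e_sunny - 3/5 * entropy(0, 3) - 2/5 * entropy(2, 0)    # ≈ 0.970
gain_wind     = e_sunny - 2/5 * entropy(1, 1) - 3/5 * entropy(1, 2)    # ≈ 0.020 (slide: 0.019)
gain_temp     = e_sunny - 2/5 * entropy(0, 2) - 2/5 * entropy(1, 1) - 1/5 * entropy(1, 0)  # ≈ 0.570
print(round(gain_humidity, 3), round(gain_wind, 3), round(gain_temp, 3))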
SECOND AND THIRD STEPS
Calculate Entropy
ACTIVITIES / CASE STUDIES / IMPORTANT FACTS RELATED TO THE SESSION
CASE STUDY
Induction of a decision tree using information gain.
The training set, D, consists of class-labeled tuples randomly selected from the AllElectronics customer database.
The class-label attribute, buys_computer, has two distinct values (namely, {yes, no}); therefore, there are two distinct classes (i.e., m = 2). Let class C1 correspond to yes and class C2 correspond to no.
There are nine tuples of class yes and five tuples of class no.
EXAMPLES (Cont.)
A (root) node N is created for the tuples in D.
To find the splitting criterion for these tuples, we must compute the
information gain of each attribute.
First, we compute the expected information (entropy) needed to classify a tuple in D. With nine tuples of class yes and five of class no, Info(D) = −(9/14) · log2(9/14) − (5/14) · log2(5/14) = 0.940 bits.
Next, we need to compute the expected information requirement for each attribute. Let's start with the attribute age.
The expected information needed to classify a tuple in D if the tuples are partitioned according to age, Info_age(D), is the weighted sum of the entropies of the age partitions.
Hence, the gain in information from such a partitioning is Gain(age) = Info(D) − Info_age(D).
Similarly, we can compute Gain(income) = 0.029 bits, Gain(student) = 0.151 bits, and Gain(credit_rating) = 0.048 bits.
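A compact Python check of these figures (not part of the original slides): the overall class counts (9 yes, 5 no) come from the previous slide, while the per-age counts used below ([2, 3], [4, 0], [3, 2]) are assumed from the AllElectronics example in Han & Kamber (listed in the references).

from math import log2

def info(counts):
    # Expected information (entropy) of a class distribution given as counts.
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c)

info_D = info([9, 5])                                    # ≈ 0.940 bits
info_age = sum(sum(part) / 14 * info(part)
               for part in ([2, 3], [4, 0], [3, 2]))     # ≈ 0.694 bits (assumed age partition)
print(round(info_D - info_age, 3))                       # Gain(age) ≈ 0.246 bits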
SUMMARY
Because age has the highest information gain
among the attributes, it is selected as the
splitting attribute.
Node N is labeled with age, and branches are
grown for each of the attribute’s values.
The tuples falling into the partition for age = middle_aged all belong to the same class.
Note: The attribute age has the highest
information gain and therefore becomes the
splitting attribute at the root node of the
decision tree. Branches are grown for each
outcome of age. The tuples are shown
partitioned accordingly.
Tree Pruning
Algorithm for Decision Tree Induction
Inductive inference with decision trees
Decision tree learning is one of the most widely used and practical methods of inductive inference.
Features:
• A method for approximating discrete-valued functions (including Boolean functions)
• Learned functions are represented as decision trees (or as if-then-else rules)
• An expressive hypothesis space, including disjunction
• Robust to noisy data
SELF-ASSESSMENT QUESTIONS
1. A _________ is a decision support tool that uses a tree-like graph or model of
decisions and their possible consequences, including chance event outcomes,
resource costs, and utility.
a) Decision tree
b) Graphs
c) Trees
d) Neural Networks
2. Which of the following are decision tree nodes?
a) Decision nodes
b) End nodes
c) Chance nodes
d) All of the mentioned
TERMINAL QUESTIONS
1. Describe decision tree induction
2. List the advantages and disadvantages of decision trees
3. Analyze and illustrate the decision tree algorithm with an example
4. Summarize the concepts of entropy, information gain, and gain ratio
REFERENCES FOR FURTHER LEARNING OF THE
SESSION
Reference Books:
• Han J & Kamber M, “Data Mining: Concepts and Techniques”, Third Edition, Elsevier, 2011.
• https://www.upgrad.com/blog/data-mining-techniques/
• https://www.javatpoint.com/data-mining-techniques
• https://www.datasciencecentral.com/profiles/blogs/the-7-most-important-data-mining-techniques
• https://onix-Classifications.com/blog/8-data-mining-techniques-you-must-learn-to-succeed-in-business
• https://www.infogix.com/top-5-data-mining-techniques/
Sites and Web links:
1. https://www.geeksforgeeks.org/data-mining/
2. https://www.javatpoint.com/data-mining
3. https://www.springboard.com/blog/data-science/data-mining/
4. https://onlinecourses.nptel.ac.in/noc21_cs06/preview
5. https://www.codingninjas.com/codestudio/library/rule-based-classification-in-data-mining
THANK YOU