
Shree Swaminarayan Institute of Technology CE DEPT.

(VI SEMESTER)

Date:
Student Name:
Student Enrollment No:

EXPERIMENT NO: 8

TITLE: Decision Tree Learning.


OBJECTIVE: On completion of this exercise, the student will be able to understand:

 What is a decision tree?
 Basic concepts of decision trees.
 How to construct a decision tree.
 The decision tree algorithm (ID3).

THEORY:

What is a Decision Tree?

Imagine you only ever do four things at the weekend: go shopping, watch a movie, play tennis or just stay
in. What you do depends on three things: the weather (windy, rainy or sunny), how much money you
have (rich or poor) and whether your parents are visiting. You say to yourself: if my parents are
visiting, we'll go to the cinema. If they're not visiting and it's sunny, then I'll play tennis, but if it's windy
and I'm rich, then I'll go shopping. If they're not visiting, it's windy and I'm poor, then I will go to the
cinema. If they're not visiting and it's rainy, then I'll stay in.
To remember all this, you draw a flowchart which will enable you to read off your decision. We call such
diagrams decision trees. A suitable decision tree for the weekend decision choices would be as follows:

Figure 1
We can see why such diagrams are called trees, because, while they are admittedly upside down, they
start from a root and have branches leading to leaves (the tips of the graph at the bottom). Note that the
leaves are always decisions, and a particular decision might be at the end of multiple branches (for
example, we could choose to go to the cinema for two different reasons).
Armed with our decision tree, on Saturday morning, when we wake up, all we need to do is check (a) the
weather (b) how much money we have and (c) whether our parents' car is parked in the drive. The
decision tree will then enable us to make our decision. Suppose, for example, that the parents haven't
turned up and the sun is shining. Then this path through our decision tree will tell us what to do:

Figure 2

and hence we run off to play tennis because our decision tree told us to. Note that the decision tree covers
all eventualities. That is, there are no values that the weather, the parents turning up or the money
situation could take which aren't catered for in the decision tree. Note that, in this experiment, we will be
looking at how to automatically generate decision trees from examples, not at how to turn thought
processes into decision trees.

Reading Decision Trees

There is a link between decision tree representations and logical representations, which can be exploited
to make it easier to understand (read) learned decision trees. If we think about it, every decision tree is
actually a disjunction of implications (if ... then statements), and the implications are Horn clauses: a
conjunction of literals implying a single literal. In the above tree, we can see this by reading from the root
node to each leaf node:

If the parents are visiting, then go to the cinema
Or
If the parents are not visiting and it is sunny, then play tennis
Or
If the parents are not visiting and it is windy and you're rich, then go shopping
Or
If the parents are not visiting and it is windy and you're poor, then go to the cinema
Or
If the parents are not visiting and it is rainy, then stay in.

Of course, this is just a re-statement of the original mental decision-making process we described.
Remember, however, that we will be programming an agent to learn decision trees from examples, so we
will start with only example situations rather than a ready-made set of rules. It will therefore be important
for us to be able to read the decision tree the agent suggests.
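These readings can also be written directly as nested conditionals in a program. The following Python sketch is only an illustration of this idea; the function name weekend_activity and the way the inputs are encoded are our own assumptions, not part of the original tree.

def weekend_activity(parents_visiting, weather, money):
    """Read off the weekend decision tree of Figure 1.

    parents_visiting: bool; weather: 'sunny', 'windy' or 'rainy'; money: 'rich' or 'poor'.
    """
    if parents_visiting:
        return "cinema"
    if weather == "sunny":
        return "play tennis"
    if weather == "windy":
        return "shopping" if money == "rich" else "cinema"
    if weather == "rainy":
        return "stay in"
    raise ValueError("unexpected weather value: %r" % weather)

print(weekend_activity(parents_visiting=False, weather="sunny", money="poor"))  # -> play tennis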

ID3 Algorithm

Start from the root node of the decision tree, test the attribute specified by this node, then move down the
tree branch according to the attribute's value in the given instance. This process is then repeated at the
sub-tree level.
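As an illustration of this classification procedure, the Python sketch below walks a tree stored as nested dictionaries, keyed first by attribute name and then by attribute value. The representation and the function name classify are assumptions made for the example, not a prescribed format.

def classify(tree, instance):
    # Walk the tree until a leaf (a plain label) is reached.
    while isinstance(tree, dict):
        attribute = next(iter(tree))      # attribute tested at this node
        value = instance[attribute]       # the instance's value for that attribute
        tree = tree[attribute][value]     # move down the matching branch
    return tree                           # leaf node: the decision

# The weekend decision tree from Figure 1 as nested dictionaries
weekend_tree = {
    "parents visiting": {
        "yes": "cinema",
        "no": {
            "weather": {
                "sunny": "play tennis",
                "windy": {"money": {"rich": "shopping", "poor": "cinema"}},
                "rainy": "stay in",
            }
        },
    }
}

print(classify(weekend_tree, {"parents visiting": "no", "weather": "windy", "money": "rich"}))  # shopping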

What is decision tree learning suited for?

1. Instances are represented as attribute-value pairs. For example, the attribute 'Temperature' has the
   values 'hot', 'mild' and 'cool'. The attribute-value representation can also be extended to
   continuous-valued (numeric) data.
2. The target function has discrete output values. It can easily deal with instances that are assigned a
   Boolean decision, such as 'true' and 'false', or 'p (positive)' and 'n (negative)'. Although it is possible
   to extend the target to real-valued outputs, we do not cover that extension here.
3. The training data may contain errors. These can be dealt with using pruning techniques, which we
   will not cover here.

The three widely used decision tree learning algorithms are ID3, ASSISTANT and C4.5. We will cover ID3
in this experiment.

Decision tree learning is attractive for three reasons (Paul Utgoff & Carla Brodley, 1990):

1. A decision tree is a good generalization for unobserved instances, provided the instances are described
   in terms of features that are correlated with the target concept.
2. The methods are computationally efficient, with cost roughly proportional to the number of observed
   training instances.
3. The resulting decision tree provides a representation of the concept that appeals to humans because
   it renders the classification process self-evident.
ID3 is a non-incremental algorithm, meaning it derives its classes from a fixed set of training instances. An
incremental algorithm revises the current concept definition, if necessary, with each new sample. The classes
created by ID3 are inductive; that is, given a small set of training instances, the specific classes created by
ID3 are expected to work for all future instances. The distribution of the unknowns must be the same as
that of the test cases. Inductive classes cannot be proven to work in every case since they may classify an
infinite number of instances. Note that ID3 (or any inductive algorithm) may misclassify data.

Data Description

The sample data used by ID3 has certain requirements, which are:
 Attribute-value description - the same attributes must describe each example and have a fixed
number of values.

 Predefined classes - an example's attributes must already be defined, that is, they are not learned
by ID3.
 Discrete classes - classes must be sharply delineated. Continuous classes broken up into vague
categories such as a metal being "hard, quite hard, flexible, soft, quite soft" are suspect.
 Sufficient examples - since inductive generalization is used (i.e. not provable) there must be
enough test cases to distinguish valid patterns from chance occurrences.

Attribute Selection

1. How to find entropy?

How does ID3 decide which attribute is the best? A statistical property, called information gain, is
used. Gain measures how well a given attribute separates training examples into targeted classes.
The attribute with the highest information gain (the one most useful for classification) is selected.
In order to define gain, we first borrow an idea from information theory called entropy. Entropy
measures the amount of information in an attribute.

Given a collection S of c outcomes,

Entropy(S) = Σ (over i = 1 to c) -pi * log2(pi)

Where:
pi is the proportion of S belonging to class i.
The sum Σ is taken over the c classes.
log2 is log base 2.
Note that S is not an attribute but the entire sample set.

Example 1

If S is a collection of 14 examples with 9 YES and 5 NO examples, then

Entropy(S) = - (9/14)*log2(9/14) - (5/14)*log2(5/14) = 0.940

Notice entropy is 0 if all members of S belong to the same class (the data is perfectly classified).
The range of entropy is 0 ("perfectly classified") to 1 ("totally random").
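This arithmetic can be checked with a few lines of Python (an illustrative sketch, not part of the original example):

import math

# Example 1: 9 YES and 5 NO examples out of 14
p_yes, p_no = 9 / 14, 5 / 14
entropy_s = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)
print(round(entropy_s, 3))  # 0.94, i.e. the 0.940 quoted above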

2. How to find Gain?

Two-step process to find the gain:

a. Find the entropy of the whole data set, i.e. Entropy(S).
b. Find the expected entropy of the data set partitioned on a particular attribute A, i.e. EntropyA(S).

Gain(S, A), the information gain of example set S on attribute A, is then defined as

Gain(S, A) = Entropy(S) - Σ (|Sv| / |S|) * Entropy(Sv)

Where:

The sum Σ is taken over each value v of all possible values of attribute A


Sv = subset of S for which attribute A has value v
|Sv| = number of elements in Sv
|S| = number of elements in S

Example 2

Suppose S is a set of 14 examples in which one of the attributes is wind speed. The values of
Wind can be Weak or Strong. The classification of these 14 examples are 9 YES and 5 NO. For
attribute Wind, suppose there are 8 occurrences of Wind = Weak and 6 occurrences of Wind =
Strong. For Wind = Weak, 6 of the examples are YES and 2 are NO. For Wind = Strong, 3 are
YES and 3 are NO. Therefore

Entropy(Sweak) = - (6/8)*log2(6/8) - (2/8)*log2(2/8) = 0.811

Entropy(Sstrong) = - (3/6)*log2(3/6) - (3/6)*log2(3/6) = 1.00

Gain(S, Wind) = Entropy(S) - (8/14)*Entropy(Sweak) - (6/14)*Entropy(Sstrong)
             = 0.940 - (8/14)*0.811 - (6/14)*1.00
             = 0.048

For each attribute, the gain is calculated and the highest gain is used in the decision node.
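The numbers in Example 2 can be verified with a short Python sketch; the helper function entropy(pos, neg) is our own illustrative shorthand, not notation from the text:

import math

def entropy(pos, neg):
    # Entropy of a set containing pos positive and neg negative examples.
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            result -= p * math.log2(p)
    return result

# Example 2: S has 9 YES / 5 NO; Wind=Weak has 6 YES / 2 NO; Wind=Strong has 3 YES / 3 NO
gain_wind = entropy(9, 5) - (8 / 14) * entropy(6, 2) - (6 / 14) * entropy(3, 3)
print(round(gain_wind, 3))  # 0.048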

Example of ID3 algorithm

Suppose we want ID3 to decide whether the weather is amenable to playing baseball. Over the course of 2
weeks, data is collected to help ID3 build a decision tree (see table 1).

The target classification is "should we play baseball?" which can be yes or no.

The weather attributes are outlook, temperature, humidity, and wind speed. They can have the following
values:

outlook = { sunny, overcast, rain }


temperature = {hot, mild, cool }
humidity = { high, normal }
wind = {weak, strong }

Examples of set S are:

Table 1

Day  Outlook   Temperature  Humidity  Wind    Play ball
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

We need to find which attribute will be the root node in our decision tree. The gain is calculated for all
four attributes:

Gain(S, Outlook) = 0.246


Gain(S, Temperature) = 0.029
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048 (calculated in example 2)

The Outlook attribute has the highest gain; therefore, it is used as the decision attribute in the root node.
Since Outlook has three possible values, the root node has three branches (sunny, overcast, rain). The
next question is "what attribute should be tested at the Sunny branch node?" Since we've used Outlook
at the root, we only decide among the remaining three attributes: Humidity, Temperature, or Wind.

Ssunny = {D1, D2, D8, D9, D11} = 5 examples from table 1 with outlook = sunny
Gain(Ssunny, Humidity) = 0.970
Gain(Ssunny, Temperature) = 0.570
Gain(Ssunny, Wind) = 0.019

Humidity has the highest gain; therefore, it is used as the decision node. This process goes on until all
data is classified perfectly or we run out of attributes.
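The whole procedure can be sketched as a short recursive program. The Python code below is an illustrative implementation of ID3 on the Table 1 data; the function names (entropy, gain, build_tree) and the record representation are our own choices, and a production implementation would also handle unseen attribute values and add pruning.

import math
from collections import Counter

def entropy(rows, target="Play ball"):
    # Entropy of the class labels in rows.
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def gain(rows, attr, target="Play ball"):
    # Information gain of splitting rows on attribute attr.
    total = len(rows)
    remainder = 0.0
    for value in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == value]
        remainder += (len(subset) / total) * entropy(subset, target)
    return entropy(rows, target) - remainder

def build_tree(rows, attrs, target="Play ball"):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # all examples classified perfectly
        return labels[0]
    if not attrs:                      # run out of attributes: take the majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a, target))
    branches = {}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, [a for a in attrs if a != best], target)
    return {best: branches}

# Table 1 as a list of records
columns = ["Day", "Outlook", "Temperature", "Humidity", "Wind", "Play ball"]
data = [dict(zip(columns, row.split())) for row in [
    "D1 Sunny Hot High Weak No",          "D2 Sunny Hot High Strong No",
    "D3 Overcast Hot High Weak Yes",      "D4 Rain Mild High Weak Yes",
    "D5 Rain Cool Normal Weak Yes",       "D6 Rain Cool Normal Strong No",
    "D7 Overcast Cool Normal Strong Yes", "D8 Sunny Mild High Weak No",
    "D9 Sunny Cool Normal Weak Yes",      "D10 Rain Mild Normal Weak Yes",
    "D11 Sunny Mild Normal Strong Yes",   "D12 Overcast Mild High Strong Yes",
    "D13 Overcast Hot Normal Weak Yes",   "D14 Rain Mild High Strong No",
]]

tree = build_tree(data, ["Outlook", "Temperature", "Humidity", "Wind"])
print(tree)  # Outlook at the root, Humidity under Sunny, Wind under Rain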


The final decision tree

The decision tree can also be expressed in rule format:

IF outlook = sunny AND humidity = high THEN playball = no
IF outlook = sunny AND humidity = normal THEN playball = yes
IF outlook = overcast THEN playball = yes
IF outlook = rain AND wind = strong THEN playball = no
IF outlook = rain AND wind = weak THEN playball = yes

ID3 has been incorporated into a number of commercial rule-induction packages. Some specific
applications include medical diagnosis, credit risk assessment of loan applications, classification of
equipment malfunctions by their cause, classification of soybean diseases, and web search classification.

EXERCISE:

Consider the customer database described below where an application for a credit card is either approved
or rejected. Construct a decision tree (with Approved as the decision variable) using the entropy measure.

Case  Income in $K  Own home  Age  Years of employment  Approved
1     >60           Own       35   <5                   Yes
2     30-60         Own       35   >5                   Yes
3     <30           Rent      35   >5                   No
4     <30           Own       35   >5                   Yes
5     30-60         Own       35   <5                   No
6     >60           Rent      35   >5                   Yes
7     <30           Rent      35   <5                   No
8     30-60         Rent      35   >5                   Yes
9     <30           Own       35   <5                   No
10    >60           Own       35   >5                   Yes
11    30-60         Rent      35   <5                   No
12    >60           Rent      35   <5                   Yes

EVALUATION:

Observation & Implementation   Timely completion   Viva   Total
4                              2                   4      10

Subject In-charge Name & Signature: ____________


Date: ________________
