
C4.5 Decision Tree Step-by-Step Calculations

Step 1: Entropy Calculation

The dataset contains 6 'Pass' results and 4 'Fail' results.

Total examples: 10

Entropy formula:

H(S) = -(p_+ log2(p_+)) - (p_- log2(p_-))

p_+ = 6/10 = 0.6 (Pass), p_- = 4/10 = 0.4 (Fail)

H(S) = -(0.6 log2(0.6)) - (0.4 log2(0.4))

H(S) = 0.971
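
A quick Python check of this value (the entropy helper below is an illustrative sketch, not part of the original solution):

import math

def entropy(counts):
    # Shannon entropy (base 2) of a list of class counts
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([6, 4]), 3))  # 0.971  (6 Pass, 4 Fail)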

Step 2: Information Gain Calculations

1. Assessment (Good, Average, Poor)

- Good: 6 examples (5 Pass, 1 Fail)

- Average: 3 examples (1 Pass, 2 Fail)

- Poor: 1 example (0 Pass, 1 Fail)

Entropy for 'Good' subset:

H(Good) = -(5/6 log2(5/6)) - (1/6 log2(1/6)) = 0.650

Entropy for 'Average' subset:

H(Average) = -(1/3 log2(1/3)) - (2/3 log2(2/3)) = 0.918

Entropy for 'Poor' subset:


H(Poor) = 0 (since all are Fail)

Weighted Entropy for 'Assessment':

H(Assessment) = (6/10) * 0.650 + (3/10) * 0.918 + (1/10) * 0

H(Assessment) = 0.665

Information Gain for 'Assessment':

IG(Assessment) = 0.971 - 0.665 = 0.306
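
The same arithmetic can be verified with a small reusable helper (a sketch; the names entropy and information_gain are illustrative, not from the original). The same pattern applies to the remaining attributes below:

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, subset_counts):
    # IG = H(parent) - weighted average of subset entropies
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in subset_counts)
    return entropy(parent_counts) - weighted

# Assessment: Good (5 Pass, 1 Fail), Average (1 Pass, 2 Fail), Poor (0 Pass, 1 Fail)
print(round(information_gain([6, 4], [[5, 1], [1, 2], [0, 1]]), 3))
# 0.305 -- the 0.306 above comes from rounding the weighted entropy to 0.665 first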

2. Assignment (Yes, No)

- Yes: 6 examples (5 Pass, 1 Fail)

- No: 4 examples (1 Pass, 3 Fail)

Entropy for 'Yes' subset:

H(Yes) = -(5/6 log2(5/6)) - (1/6 log2(1/6)) = 0.650

Entropy for 'No' subset:

H(No) = -(1/4 log2(1/4)) - (3/4 log2(3/4)) = 0.811

Weighted Entropy for 'Assignment':

H(Assignment) = (6/10) * 0.650 + (4/10) * 0.811

H(Assignment) = 0.714

Information Gain for 'Assignment':

IG(Assignment) = 0.971 - 0.714 = 0.257
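
Verifying these figures directly (illustrative variable names; a sketch only):

import math

h_yes = -(5/6) * math.log2(5/6) - (1/6) * math.log2(1/6)  # ~0.650
h_no = -(1/4) * math.log2(1/4) - (3/4) * math.log2(3/4)   # ~0.811
h_assignment = 0.6 * h_yes + 0.4 * h_no                   # ~0.715
print(round(0.971 - h_assignment, 3))                     # 0.256 (0.257 above, which rounds intermediates)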

3. Project (Yes, No)

- Yes: 5 examples (4 Pass, 1 Fail)

- No: 5 examples (2 Pass, 3 Fail)


Entropy for 'Yes' subset:

H(Yes) = -(4/5 log2(4/5)) - (1/5 log2(1/5)) = 0.722

Entropy for 'No' subset:

H(No) = -(2/5 log2(2/5)) - (3/5 log2(3/5)) = 0.971

Weighted Entropy for 'Project':

H(Project) = (5/10) * 0.722 + (5/10) * 0.971

H(Project) = 0.846

Information Gain for 'Project':

IG(Project) = 0.971 - 0.846 = 0.125
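
A matching check for the 'Project' split (again an illustrative sketch):

import math

h_yes = -(4/5) * math.log2(4/5) - (1/5) * math.log2(1/5)  # ~0.722
h_no = -(2/5) * math.log2(2/5) - (3/5) * math.log2(3/5)   # ~0.971
h_project = 0.5 * h_yes + 0.5 * h_no                      # ~0.846
print(round(0.971 - h_project, 3))                        # 0.125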

4. Seminar (Good, Poor, Fair)

- Good: 4 examples (4 Pass, 0 Fail)

- Poor: 3 examples (1 Pass, 2 Fail)

- Fair: 3 examples (1 Pass, 2 Fail)

Entropy for 'Good' subset:

H(Good) = 0 (since all are Pass)

Entropy for 'Poor' subset:

H(Poor) = -(1/3 log2(1/3)) - (2/3 log2(2/3)) = 0.918

Entropy for 'Fair' subset:

H(Fair) = -(1/3 log2(1/3)) - (2/3 log2(2/3)) = 0.918


Weighted Entropy for 'Seminar':

H(Seminar) = (4/10) * 0 + (3/10) * 0.918 + (3/10) * 0.918

H(Seminar) = 0.550

Information Gain for 'Seminar':

IG(Seminar) = 0.971 - 0.550 = 0.421
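
And for the 'Seminar' split (illustrative sketch; the small last-digit difference comes from the text rounding the weighted entropy to 0.550):

import math

h_poor = -(1/3) * math.log2(1/3) - (2/3) * math.log2(2/3)  # ~0.918 (same for 'Fair')
h_seminar = 0.4 * 0 + 0.3 * h_poor + 0.3 * h_poor          # ~0.551
print(round(0.971 - h_seminar, 3))                         # 0.42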

Step 3: Choose the Best Attribute

The attribute with the highest information gain is 'Seminar' with IG = 0.421. Thus, 'Seminar' is chosen as the root node.
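
To confirm the choice of root, the following sketch (illustrative names, not from the original) recomputes all four gains from the class counts above and picks the largest; the last digits differ slightly from the text because the text rounds intermediate entropies to three decimals:

import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, subsets):
    total = sum(parent)
    return entropy(parent) - sum(sum(s) / total * entropy(s) for s in subsets)

# (Pass, Fail) counts per attribute value, taken from the tallies above
splits = {
    "Assessment": [[5, 1], [1, 2], [0, 1]],
    "Assignment": [[5, 1], [1, 3]],
    "Project": [[4, 1], [2, 3]],
    "Seminar": [[4, 0], [1, 2], [1, 2]],
}

gains = {name: round(information_gain([6, 4], s), 3) for name, s in splits.items()}
print(gains)                      # {'Assessment': 0.305, 'Assignment': 0.256, 'Project': 0.125, 'Seminar': 0.42}
print(max(gains, key=gains.get))  # Seminar -> chosen as the root node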
