MCA3 (DS) Unit 4 ML

The document provides an overview of supervised and unsupervised learning algorithms, focusing on decision tree algorithms. It describes how decision trees work by splitting the data into nodes and branches based on attribute values to classify or predict target variables. The key algorithms discussed are ID3, C4.5, and CART. It explains that decision trees use entropy and information gain to determine the optimal attribute to split on at each node, with the goal of creating homogenous leaf nodes. Examples are also given to illustrate how a decision tree is constructed from sample training data.

Supervised and Unsupervised Learning
Overview
• Introduction
• Decision Tree Representation
• Appropriate problems for Decision tree
• Learning Algorithm
• Hypothesis Space Search
• Inductive Bias in Decision Tree Learning
• Issues in Decision Tree Learning
• Locally Weighted Regression
• Radial Basis Functions
• Case Based Reasoning
Introduction
• The Decision Tree algorithm belongs to the family of supervised learning
algorithms. Unlike many other supervised learning algorithms, the decision tree
algorithm can be used for solving both regression and classification
problems.
• The goal of using a Decision Tree is to create a training model that can
be used to predict the class or value of the target variable by learning simple
decision rules inferred from prior data (training data).
• In Decision Trees, to predict a class label for a record we start from
the root of the tree. We compare the value of the root attribute with the
record's attribute. On the basis of the comparison, we follow the branch
corresponding to that value and jump to the next node.
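As a small, hedged illustration of this traversal (not code from the slides), the Python sketch below represents a tree as nested dictionaries and walks from the root to a leaf; the attribute names and tree structure are assumptions matching the PlayTennis example shown later.

# A minimal sketch (assumed structure): a decision tree as nested dicts
# of the form {attribute: {value: subtree_or_leaf}}.
play_tennis_tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def predict(tree, record):
    """Walk from the root, following the branch that matches the record's attribute value."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))                   # attribute tested at this node
        tree = tree[attribute][record[attribute]]      # follow the matching branch
    return tree                                        # a leaf holds the class label

print(predict(play_tennis_tree, {"Outlook": "Sunny", "Humidity": "High", "Wind": "Strong"}))  # -> No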
Decision Tree Representation

• Decision trees classify instances by sorting them down the tree from the root to some leaf node
• A node
– Specifies some attribute of an instance to be tested
• A branch
– Corresponds to one of the possible values for an attribute
Different decision tree algorithms

ID3 → (Iterative Dichotomiser 3)

C4.5 → (successor of ID3)

CART → (Classification And Regression Tree)

CHAID → (Chi-square Automatic Interaction Detection; performs multi-level splits when computing classification trees)

MARS → (Multivariate Adaptive Regression Splines)

Decision Tree Representation (cont.)

A Decision Tree for the concept PlayTennis:

Outlook
├─ Sunny → Humidity
│    ├─ High → No
│    └─ Normal → Yes
├─ Overcast → Yes
└─ Rain → Wind
     ├─ Strong → No
     └─ Weak → Yes
Decision Tree Representation (cont.)
• Each path corresponds to a conjunction of attribute tests. For example, if the
instance is (Outlook=Sunny, Temperature=Hot, Humidity=High, Wind=Strong), then the
path (Outlook=Sunny ∧ Humidity=High) is matched, so the target value would be No,
as shown in the tree.
• A decision tree represents a disjunction of conjunctions of constraints on the
attribute values of instances. For example, the three positive paths can be
represented as (Outlook=Sunny ∧ Humidity=Normal) ∨ (Outlook=Overcast) ∨
(Outlook=Rain ∧ Wind=Weak), as shown in the tree.
What is the merit of tree representation?
Decision Tree Representation (cont.)
• Appropriate Problems for Decision Tree Learning
– Instances are represented by attribute-value pairs
– The target function has discrete output values
– Disjunctive descriptions may be required
– The training data may contain errors
• Both errors in classification of the training examples and errors in
the attribute values
– The training data may contain missing attribute values
– Suitable for classification
Learning Algorithm

• Main question
– Which attribute should be tested at the root of the (sub)tree?
• Greedy search using some statistical measure

• Information gain
– A quantitative measure of the worth of an attribute
– How well a given attribute separates the training examples according to
their target classification
– Information gain measures the expected reduction in entropy
Learning Algorithm (cont.)

[Figure: an alternative decision tree for PlayTennis with Temperature at the root (branches cool, mild, hot), with further tests on Outlook, Windy and Humidity; one leaf is left undetermined (?).]
What is entropy
• In decision tree machine learning, entropy is a measure used to quantify the impurity or
disorder within a set of data.

• It's a concept borrowed from information theory and is particularly useful in decision tree
algorithms, such as ID3, C4.5, and CART, for determining the best attribute to split the
data at each node.

• Entropy helps in deciding the order of attributes in the nodes of the tree during the
construction phase.

• The goal is to create splits that result in nodes containing data points that are as
homogeneous as possible with respect to the target variable.
Learning Algorithm (cont.)
• Entropy
– characterizes the (im)purity of an arbitrary collection of examples

For example
• The information required to classify the PlayTennis training examples (9 positive, 5 negative)
= -(9/14)log2(9/14) - (5/14)log2(5/14) = 0.940
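The same number can be reproduced with a few lines of Python; this is only an illustrative check of the arithmetic above, not code from the slides.

import math

def entropy(counts):
    """Entropy of a collection given the number of examples in each class."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(round(entropy([9, 5]), 3))  # 0.94 for the 9 positive / 5 negative examples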
Learning Algorithm (cont.)

The formula for entropy in the context of decision trees is often expressed as:

Entropy(S) = - Σ pi log2(pi)

where pi is the proportion of examples in S that belong to class i. For a two-class (positive/negative) problem this becomes Entropy(S) = -p+ log2(p+) - p- log2(p-).
Learning Algorithm (cont.)

• When entropy is high, it indicates high disorder or uncertainty in the dataset.
Conversely, when entropy is low, it suggests the data is more ordered or homogeneous.
• During the construction of a decision tree, the attribute with the lowest
entropy (or highest information gain, which is the reduction in entropy)
is chosen as the splitting criterion, aiming to partition the data into
subsets that are as pure as possible in terms of the target variable.
• Lower entropy after a split indicates that the resulting subsets are more
homogeneous, making decisions or predictions more accurate within
each subset.
Learning Algorithm (cont.)

• Information gain and entropy

Gain(S, A) = Entropy(S) - Σ over v ∈ Values(A) of (|Sv| / |S|) · Entropy(Sv)

✓ Values(A): the set of all possible values for attribute A

✓ Sv: the subset of S for which attribute A has value v

– First term: the entropy of the original collection S

– Second term: the expected value of the entropy after S is partitioned using attribute A
• Gain(S, A)
– The expected reduction in entropy caused by knowing the value of attribute A
– The information provided about the target function value, given the value of some other
attribute A
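A hedged Python sketch of Gain(S, A) following the definition above; the representation of examples as dictionaries and the default target name "PlayTennis" are assumptions chosen to match the training table shown later.

from collections import Counter
import math

def entropy_of(labels):
    """Entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target="PlayTennis"):
    """Gain(S, A) = Entropy(S) - sum over v in Values(A) of (|Sv|/|S|) * Entropy(Sv)."""
    total_entropy = entropy_of([row[target] for row in examples])
    remainder = 0.0
    for value in {row[attribute] for row in examples}:
        subset = [row[target] for row in examples if row[attribute] == value]
        remainder += (len(subset) / len(examples)) * entropy_of(subset)
    return total_entropy - remainder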
Learning Algorithm (cont.)
• ID3 (Examples, Target_attribute, Attributes)
– Create a Root node for the tree
– If all Examples are positive, return the single node tree Root, with
label= +
– If all Examples are negative, return the single node tree Root, with
label= −
– If Attributes is empty, return the single-node tree Root, with label =
most common value of Target_attribute in Examples
– Otherwise, begin: choose the attribute A that best classifies Examples (the one
with the highest information gain), make it the decision attribute for Root, and
recursively repeat the process on the subsets of Examples for each value of A
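A compact recursive sketch of ID3 in the spirit of the pseudocode above, reusing the information_gain function from the earlier sketch; the handling of attributes and labels is a simplified assumption, not the exact text of the algorithm's continuation.

from collections import Counter

def id3(examples, attributes, target="PlayTennis"):
    labels = [row[target] for row in examples]
    if len(set(labels)) == 1:                  # all examples positive, or all negative
        return labels[0]
    if not attributes:                         # no attributes left to test
        return Counter(labels).most_common(1)[0][0]
    # Otherwise: choose the attribute with the highest information gain as the root test
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    for value in {row[best] for row in examples}:
        subset = [row for row in examples if row[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree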
An Illustrative Example

Day Outlook Temperature Humidity Wind Play Tennis


D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rain Mild High Weak Yes
D5 Rain Cool Normal Weak Yes
D6 Rain Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rain Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rain Mild High Strong No
Training examples for the target concept PlayTennis
• Selecting the root node
– The information gain values for all four attributes
• Gain(S, Outlook) = 0.246 → selected as root attribute
• Gain(S, Humidity)= 0.151
• Gain(S, Wind)= 0.048
• Gain(S, Temperature)= 0.029
• Adding a subtree
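For reference, these gain values can be reproduced by running the information_gain sketch from the earlier slide over the 14 training examples; encoding each row of the table as a dictionary is an assumption about how the data would be loaded.

columns = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
rows = [
    ("Sunny", "Hot", "High", "Weak", "No"),           # D1
    ("Sunny", "Hot", "High", "Strong", "No"),         # D2
    ("Overcast", "Hot", "High", "Weak", "Yes"),       # D3
    ("Rain", "Mild", "High", "Weak", "Yes"),          # D4
    ("Rain", "Cool", "Normal", "Weak", "Yes"),        # D5
    ("Rain", "Cool", "Normal", "Strong", "No"),       # D6
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),  # D7
    ("Sunny", "Mild", "High", "Weak", "No"),          # D8
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),       # D9
    ("Rain", "Mild", "Normal", "Weak", "Yes"),        # D10
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),     # D11
    ("Overcast", "Mild", "High", "Strong", "Yes"),    # D12
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),     # D13
    ("Rain", "Mild", "High", "Strong", "No"),         # D14
]
data = [dict(zip(columns, row)) for row in rows]

for attribute in ["Outlook", "Humidity", "Wind", "Temperature"]:
    print(attribute, round(information_gain(data, attribute), 3))
# Prints approximately 0.247, 0.152, 0.048 and 0.029, matching the values above up to rounding.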
Hypothesis Space Search

Hypothesis space search refers to the process of exploring and evaluating different
hypotheses or models in machine learning to find the most suitable one for a given problem.
It involves searching through a space of possible models or configurations to identify
the one that best fits the data and optimizes a defined objective (like accuracy,
error minimization, etc.).
Inductive Bias in Decision Tree Learning
•Approximate bias of ID3: Shorter trees are preferred over longer trees.
Trees that place high information gain attributes close to the root are
preferred over those that do not.
•Occam's Razor: prefer the simplest hypothesis that fits the data.
•Preference or search bias: preference for certain hypotheses over others
with no hard restriction on the hypotheses that can be enumerated. ID3
demonstrates a preference bias.
•Restriction or language bias: a categorical restriction on the set of
hypotheses considered. The candidate elimination algorithm demonstrates a
restriction bias.
•In general, it is better to have a preference bias than a restriction bias.
However, some learning systems have both biases.
Issues in Decision Tree Learning (cont.)

What is Overfitting?
Overfitting is a common problem that needs to be handled while training
a decision tree model. Overfitting occurs when a model fits too
closely to the training data and may become less accurate when
encountering new data or predicting future outcomes. In an overfit
condition, a model memorizes the noise of the training data and fails to
capture essential patterns.

Issues in Decision Tree Learning (cont.)

• Avoiding overfitting
– How can we avoid overfitting? (see the sketch after this list)
• Stop growing before the tree reaches the point where it perfectly classifies the
training data
• Grow the full tree, then post-prune
– How to select the best tree?
• Measure performance statistically over the training data
• Measure performance over a separate validation data set
• MDL: minimize the complexity for encoding the training examples and
the decision trees
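As one hedged, concrete way to apply these ideas with a standard library (the slides do not prescribe a particular tool), scikit-learn's DecisionTreeClassifier supports both early stopping (max_depth, min_samples_leaf) and cost-complexity post-pruning (ccp_alpha), with a separate validation set used to pick the better tree.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Early stopping: stop growing before the tree perfectly classifies the training data.
shallow = DecisionTreeClassifier(max_depth=3, min_samples_leaf=5).fit(X_train, y_train)

# Post-pruning: grow a full tree, then prune it back with cost-complexity pruning.
pruned = DecisionTreeClassifier(ccp_alpha=0.01).fit(X_train, y_train)

# Select the better tree by measuring performance on a separate validation set.
print(shallow.score(X_val, y_val), pruned.score(X_val, y_val))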
Issues in Decision Tree Learning
Decision tree learning is a powerful and popular machine learning technique, but it's not
without its challenges and limitations. Here are some key issues associated with decision
tree learning:

1. Overfitting: Decision trees can easily overfit the training data, especially when they grow
to be very deep and complex. This results in the model learning noise or specific patterns
that are unique to the training set but do not generalize well to unseen data.

2. High Variance: Small changes in the training data can lead to significantly different trees.
This high variance can make decision trees unstable and sensitive to variations in the
dataset.

3. Feature Importance and Correlation: Decision trees can struggle with identifying and
using correlated features effectively. Redundant or highly correlated features might affect
the importance assigned to individual features or cause biased splits in the tree.
Issues in Decision Tree Learning
4. Bias in Attribute Selection Heuristics: The choice of attribute selection heuristics
(e.g., information gain, Gini impurity) can introduce biases towards certain types of
attributes or certain types of splits, impacting the final tree structure and
performance.

5. Handling Missing Values: While some decision tree algorithms handle missing
values well by making assumptions about the missing data, others might struggle or
require additional preprocessing steps.

Addressing these issues often involves using ensemble methods like Random
Forests, Gradient Boosting, or implementing techniques like cross-validation,
pruning, or feature engineering to improve decision tree models' performance and
robustness.
Locally Weighted Linear Regression

Locally Weighted Regression (LWR) is a non-parametric regression technique used for
making predictions based on locally weighted linear regression. It differs from global
regression methods (like ordinary linear regression) by giving more weight to data points
in the neighborhood of the query point when making predictions.
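A minimal sketch of locally weighted linear regression for one-dimensional inputs, assuming a Gaussian kernel for the weights and NumPy for the weighted least-squares solve; the bandwidth tau and the toy data are illustrative assumptions, not values from the slides.

import numpy as np

def lwr_predict(x_query, X, y, tau=0.5):
    """Predict y at x_query by fitting a linear model weighted toward nearby points."""
    Xb = np.column_stack([np.ones(len(X)), X])             # add a bias column
    xq = np.array([1.0, x_query])
    w = np.exp(-(X - x_query) ** 2 / (2 * tau ** 2))       # Gaussian weights: nearby points count more
    W = np.diag(w)
    theta = np.linalg.pinv(Xb.T @ W @ Xb) @ Xb.T @ W @ y   # weighted least squares
    return xq @ theta

# Toy usage: noisy samples of a sine curve.
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 60)
y = np.sin(X) + 0.1 * rng.standard_normal(60)
print(lwr_predict(3.0, X, y))  # close to sin(3.0) ≈ 0.141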
Radial Bases Functions:
Radial Basis Functions (RBFs) are mathematical functions whose
value depends on the distance between the input and a center. They
are commonly used in various fields including machine learning,
interpolation, approximation, and signal processing.
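A brief sketch of a Gaussian radial basis function and its use as a similarity-based feature map; the centers, the width gamma and the sample input are illustrative assumptions.

import numpy as np

def gaussian_rbf(x, center, gamma=1.0):
    """The value depends only on the distance between the input and the center."""
    return np.exp(-gamma * np.sum((x - center) ** 2))

# Map a 2-D input onto features given by its similarity to a few fixed centers.
centers = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
x = np.array([0.9, 1.1])
features = np.array([gaussian_rbf(x, c) for c in centers])
print(features)  # largest value for the nearest center, [1.0, 1.0]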
Case Based Reasoning
Case-Based Reasoning (CBR) is an AI reasoning paradigm that solves
new problems by retrieving and reusing solutions from similar past
cases. It operates on the idea that similar problems tend to have
similar solutions, and it mimics human problem-solving by leveraging
past experiences.
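A toy sketch of the retrieve-and-reuse cycle: past cases are stored as feature vectors with their solutions, and the solution of the most similar case is reused for a new problem. The case base and the Euclidean similarity measure are assumptions for illustration only.

import math

# Past cases: (feature vector describing the problem, stored solution)
case_base = [
    ((2.0, 1.0), "solution A"),
    ((8.0, 3.0), "solution B"),
    ((5.0, 5.0), "solution C"),
]

def solve(new_problem):
    """Retrieve the most similar past case and reuse its solution."""
    nearest = min(case_base, key=lambda case: math.dist(case[0], new_problem))
    return nearest[1]

print(solve((7.5, 2.5)))  # reuses "solution B" from the closest past case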
THANK YOU
