MCA3 (DS) Unit 4 ML
Decision Tree and Instance-Based Learning
Overview
• Introduction
• Decision Tree Representation
• Appropriate problems for Decision tree
• Learning Algorithm
• Hypothesis Space Search
• Inductive Bias in Decision Tree Learning
• Issues in Decision Tree Learning
• Locally Weighted Regression
• Radial Basis Functions
• Case Based Reasoning
Introduction
• The decision tree algorithm belongs to the family of supervised learning
algorithms. Unlike many other supervised learning algorithms, a decision tree
can be used to solve both regression and classification problems.
• The goal of using a decision tree is to create a model that can be used to
predict the class or value of the target variable by learning simple decision
rules inferred from prior (training) data.
• In decision trees, to predict a class label for a record we start from the
root of the tree. We compare the value of the root attribute with the
corresponding attribute of the record and, on the basis of that comparison,
follow the branch for that value and jump to the next node, repeating until a
leaf node supplies the prediction.
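To make this traversal concrete, here is a minimal sketch in Python. The nested-dict tree mirrors the PlayTennis tree shown on the next slide, and the classify helper is illustrative only; it is not part of any library.

```python
# Minimal sketch: classifying a record by walking a decision tree from the root.
# The nested-dict representation and the classify() helper are illustrative only.

play_tennis_tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(tree, record):
    """Follow the branch matching the record's attribute value until a leaf is reached."""
    if not isinstance(tree, dict):      # a leaf is stored as a plain class label
        return tree
    attribute = next(iter(tree))        # attribute tested at this node
    value = record[attribute]           # the record's value for that attribute
    return classify(tree[attribute][value], record)

record = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Strong"}
print(classify(play_tennis_tree, record))   # -> "No"
```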
Decision Tree Representation
[Figure: A decision tree for the concept PlayTennis. The root node tests Outlook, with branches Sunny, Overcast, and Rain; the Sunny branch tests Humidity (High → No, Normal → Yes), Overcast leads directly to Yes, and the Rain branch tests Wind (Strong → No, Weak → Yes).]
Decision Tree Representation (cont.)
• Each path from the root to a leaf corresponds to a conjunction of attribute
tests. For example, for the instance (Outlook=Sunny, Temperature=Hot,
Humidity=High, Wind=Strong), the path (Outlook=Sunny ∧ Humidity=High) is
matched, so the target value is No, as shown in the tree.
• Main question
– Which attribute should be tested at the root of the (sub)tree?
• Greedy search using some statistical measure
• Information gain
– A quantitative measure of the worth of an attribute
– How well a given attribute separates the training examples according to
their target classification
– Information gain measures the expected reduction in entropy
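As a rough sketch of how this measure is computed (plain Python, not tied to any particular library), information gain is the entropy of the parent collection minus the weighted entropy of the subsets produced by splitting on an attribute. The example class counts are those of the standard 14-example PlayTennis data split on Outlook.

```python
# Sketch of information gain as the expected reduction in entropy:
#   Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)
from math import log2

def entropy(class_counts):
    """Entropy of a collection, given the count of examples in each class."""
    total = sum(class_counts)
    return -sum((c / total) * log2(c / total) for c in class_counts if c > 0)

def information_gain(parent_counts, subset_counts):
    """parent_counts: class counts before the split; subset_counts: class counts per branch."""
    total = sum(parent_counts)
    weighted = sum(sum(s) / total * entropy(s) for s in subset_counts)
    return entropy(parent_counts) - weighted

# Splitting the 14 PlayTennis examples [9 yes, 5 no] on Outlook gives the
# branches Sunny=[2,3], Overcast=[4,0], Rain=[3,2].
print(information_gain([9, 5], [[2, 3], [4, 0], [3, 2]]))   # ~0.246
```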
Learning Algorithm
[Figure: A partially learned tree for PlayTennis. The root tests Outlook (branches sunny, overcast, rain); one branch already tests Wind (strong → no, weak → yes), others end in yes/no leaves, and one node is still marked "?", posing the question of which attribute (for example, Temperature) should be tested there next.]
What is entropy
• In decision tree machine learning, entropy is a measure used to quantify the impurity or
disorder within a set of data.
• It's a concept borrowed from information theory and is particularly useful in decision tree
algorithms, such as ID3, C4.5, and CART, for determining the best attribute to split the
data at each node.
• Entropy helps in deciding the order of attributes in the nodes of the tree during the
construction phase.
• The goal is to create splits that result in nodes containing data points that are as
homogeneous as possible with respect to the target variable.
Learning Algorithm (cont.)
• Entropy
– characterizes the (im)purity of an arbitrary collection of examples
– for a collection S of positive and negative examples, Entropy(S) = −p⁺ log₂ p⁺ − p⁻ log₂ p⁻, where p⁺ and p⁻ are the proportions of positive and negative examples in S
For example
• The entropy of the 14 training examples of Table 3.2 (9 positive, 5 negative):
Entropy(S) = −(9/14) log₂(9/14) − (5/14) log₂(5/14) = 0.940
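A quick verification of this figure in Python; the numbers are taken directly from the slide.

```python
from math import log2

# Entropy of 14 training examples with 9 positive and 5 negative instances.
print(-(9 / 14) * log2(9 / 14) - (5 / 14) * log2(5 / 14))   # ≈ 0.940
```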
What is Overfitting?
Overfitting is a common problem that must be handled while training a
decision tree model. Overfitting occurs when a model fits the training data
too closely and becomes less accurate when encountering new data or
predicting future outcomes. In an overfit condition, the model memorizes the
noise of the training data and fails to capture the essential patterns.
Issues in Decision Tree Learning (cont.)
• Avoiding overfitting
– How can we avoid overfitting?
• Stop growing before it reaches the point where it perfectly classifies the
training data
• Grow full tree, then post-prune
– How to select the best tree?
• Measure performance statistically over the training data
• Measure performance over a separate validation data set
• MDL: minimize the complexity of encoding the training examples and
the decision tree
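Below is a hedged illustration of the two strategies above using scikit-learn's CART-based DecisionTreeClassifier: stopping growth early (pre-pruning via max_depth) and growing the full tree then post-pruning (cost-complexity pruning via ccp_alpha). The dataset and parameter values are arbitrary demonstration choices, not recommendations.

```python
# Compare an unrestricted (likely overfit) tree with pre-pruned and post-pruned
# trees on a held-out validation set. Dataset and parameters are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for name, tree in [
    ("full tree (no pruning)",    DecisionTreeClassifier(random_state=0)),
    ("pre-pruned (max_depth=3)",  DecisionTreeClassifier(max_depth=3, random_state=0)),
    ("post-pruned (ccp_alpha)",   DecisionTreeClassifier(ccp_alpha=0.01, random_state=0)),
]:
    tree.fit(X_train, y_train)
    print(name,
          "train acc = %.3f" % tree.score(X_train, y_train),
          "validation acc = %.3f" % tree.score(X_val, y_val))

# The full tree typically scores ~1.0 on the training data but lower on the
# validation data, while the pruned trees narrow that gap.
```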
Issues in Decision Tree Learning
Decision tree learning is a powerful and popular machine learning technique, but it's not
without its challenges and limitations. Here are some key issues associated with decision
tree learning:
1. Overfitting: Decision trees can easily overfit the training data, especially when they grow
to be very deep and complex. This results in the model learning noise or specific patterns
that are unique to the training set but do not generalize well to unseen data.
2. High Variance: Small changes in the training data can lead to significantly different trees.
This high variance can make decision trees unstable and sensitive to variations in the
dataset.
3. Feature Importance and Correlation: Decision trees can struggle with identifying and
using correlated features effectively. Redundant or highly correlated features might affect
the importance assigned to individual features or cause biased splits in the tree.
Issues in Decision Tree Learning (cont.)
4. Bias in Attribute Selection Heuristics: The choice of attribute selection heuristics
(e.g., information gain, Gini impurity) can introduce biases towards certain types of
attributes or certain types of splits, impacting the final tree structure and
performance.
5. Handling Missing Values: While some decision tree algorithms handle missing
values well by making assumptions about the missing data, others might struggle or
require additional preprocessing steps.
Addressing these issues often involves using ensemble methods such as Random
Forests or Gradient Boosting, or applying techniques such as cross-validation,
pruning, and feature engineering, to improve the performance and robustness
of decision tree models.
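As a brief, illustrative sketch of these remedies with scikit-learn, the snippet below compares a single (high-variance) decision tree with a Random Forest ensemble, both scored with 5-fold cross-validation. The dataset and settings are arbitrary choices for demonstration.

```python
# Averaging many trees built on bootstrap samples (a Random Forest) reduces the
# variance of a single tree; cross-validation gives a more reliable estimate of
# generalization than training accuracy alone.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

print("decision tree :", cross_val_score(single_tree, X, y, cv=5).mean())
print("random forest :", cross_val_score(forest, X, y, cv=5).mean())
```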
Locally Weighted Linear Regression