AMBO UNIVERSITY INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE
MASTERS PROGRAM IN COMPUTER SCIENCE
Seminar on Dimensionality Reduction
PREPARED BY: LENCHO JAMBARE
May 14, 2019
Outline
• Machine Learning
• Predictive Modeling
• Dimensionality Reduction
• Why Dimensionality Reduction
• Feature Selection and Feature Reduction
Introduction to Dimensionality Reduction
• Machine Learning: a field of study that gives computers the ability to "learn" from data without being explicitly programmed.
• Predictive Modeling: a probabilistic process that allows us to forecast outcomes on the basis of some predictors.
• These predictors are the features that come into play when deciding the final result, i.e. the outcome of the model.
Dimensionality Reduction
• Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
• Objective: find a low-dimensional representation of the data that retains as much information as possible.
• It can be divided into feature selection and feature extraction.
Why Dimensionality Reduction?
• Visualization: projection of high-dimensional
data onto 2D or 3D.
• Data compression: efficient storage and
retrieval.
• Noise removal: positive effect on query
accuracy.
• It reduces computation time.
• It also helps remove redundant features, if any.
• Dimensionality reduction is an effective approach to downsizing data.
Feature Selection and Feature Reduction
• Feature selection is the process of identifying and selecting the features that are relevant to your problem.
• In its simplest form, feature selection means examining the relationship between each feature and the target variable, scoring each feature against the target and keeping only the relevant ones (as sketched below).
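• A minimal sketch of this kind of filter-style selection in Python, assuming scikit-learn, its iris data set, and an ANOVA F-test as the relevance score (all illustrative choices, not prescribed by the slides):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)

# Score each feature against the target and keep the 2 most relevant ones.
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)

print("score per feature:", selector.scores_.round(1))
print("reduced shape:", X_reduced.shape)   # (150, 2)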
Feature selection:
Feature selection has the following advantages:
• Improved mining performance, in terms of:
– Speed of learning
– Predictive accuracy
– Simplicity and comprehensibility of mined results
Methods of attribute subset selection
1. Step-wise forward selection:
• The procedure starts with an empty set of attributes.
• The best of the original attributes is determined and added to the set.
• At each subsequent iteration or step, the best of the remaining original attributes is added to the set (a minimal sketch follows below).
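• A minimal sketch of step-wise forward selection in Python, assuming scikit-learn, the iris data set, and cross-validated accuracy of a logistic-regression model as the selection criterion (in practice one would stop once the score no longer improves):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

selected = []                              # start with an empty attribute set
remaining = list(range(X.shape[1]))

while remaining:
    # Score every candidate attribute when added to the current set.
    scores = {f: cross_val_score(LogisticRegression(max_iter=1000),
                                 X[:, selected + [f]], y, cv=5).mean()
              for f in remaining}
    best = max(scores, key=scores.get)     # best remaining original attribute
    selected.append(best)
    remaining.remove(best)
    print("added attribute", best, "cv accuracy", round(scores[best], 3))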
Methods of attribute subset selection (cont'd)
2. Step-wise backward elimination:
• The procedure starts with the full set of attributes.
• At each step, it removes the worst attribute remaining in the set (see the sketch after this list).
3. Combination of forward selection and backward elimination:
• At each step, the procedure selects the best attribute and removes the worst from among the remaining attributes.
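• A minimal sketch of step-wise backward elimination in Python, assuming scikit-learn's SequentialFeatureSelector, the iris data set, and a target of two retained attributes (illustrative choices only):

from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# Start from the full attribute set and repeatedly drop the worst attribute.
selector = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                     n_features_to_select=2,
                                     direction="backward",
                                     cv=5)
selector.fit(X, y)
print("kept attributes:", selector.get_support(indices=True))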
Methods of attribute subset selection (cont'd)
4. Decision tree induction:
• Decision tree induction constructs a flowchart-like structure where each internal (non-leaf) node denotes a test on an attribute.
• At each node, the algorithm chooses the "best" attribute to partition the data into individual classes.
• When decision tree induction is used for attribute subset selection, a tree is constructed from the given data.
• All attributes that do not appear in the tree are assumed to be irrelevant.
• The set of attributes appearing in the tree forms the reduced subset of attributes (as sketched below).
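• A minimal sketch of decision-tree-based attribute subset selection in Python, assuming scikit-learn's DecisionTreeClassifier and the iris data set:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Induce a tree from the given data.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Attributes that never appear in a split are assumed irrelevant.
used = np.flatnonzero(tree.feature_importances_ > 0)
X_reduced = X[:, used]
print("attributes appearing in the tree:", used)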
Feature Reduction
• We create new features as functions of the existing ones (instead of choosing a subset of the existing features).
• This can be achieved in an unsupervised manner (e.g., principal component analysis chooses a projection that is efficient for representation).
Principal Component Analysis
• The aim is to find a new feature space with minimum loss of information.
• It is assumed that the "most important" aspects of the data lie along the projections with the greatest variance.
• Principal component analysis (PCA) transforms the data to a new coordinate system such that:
• The greatest variance lies on the first coordinate (the first principal component), the second greatest variance lies on the second coordinate (the second principal component), and so on (a minimal sketch follows below).
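• A minimal sketch of PCA in Python, assuming scikit-learn and the 4-dimensional iris data set, projected onto its first two principal components:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, y = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)   # coordinates of each sample in the new (principal) axes

print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
print("projected shape:", X_2d.shape)   # (150, 2)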
Thank you for your attention!