AMBO UNIVERSITY INSTITUTE OF TECHNOLOGY

DEPARTMENT OF COMPUTER SCIENCE

MASTER'S PROGRAM IN COMPUTER SCIENCE

Seminar on Dimensionality Reduction

PREPARED BY: LENCHO JAMBARE

May 14, 2019

Outline
• Machine Learning
• Predictive Modeling
• Dimensionality Reduction
• Why Dimensionality Reduction
• Feature Selection and Feature Reduction

Introduction to Dimensionality Reduction
• Machine Learning: a field of study that enables computers to "learn" from data without being explicitly programmed.
• Predictive Modeling: a probabilistic process that forecasts outcomes on the basis of some predictors.
• These predictors are the features that come into play when deciding the final result, i.e. the outcome of the model (a minimal sketch follows below).
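For concreteness, here is a minimal predictive-modeling sketch in Python; the use of scikit-learn, logistic regression, and the bundled iris dataset are illustrative assumptions, not part of the slides.

```python
# Minimal predictive-modeling sketch (library, model, and dataset are
# illustrative assumptions): fit a model on predictors X to forecast outcome y.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)                      # predictors and outcome
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))   # quality of the forecast
```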
Dimensionality Reduction
• Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
• Objective: find a low-dimensional representation of the data that retains as much information as possible.
• It can be divided into feature selection and feature extraction.

Why Dimensionality Reduction?
• Visualization: projection of high-dimensional data onto 2D or 3D.
• Data compression: efficient storage and retrieval.
• Noise removal: a positive effect on query accuracy.
• It reduces computation time.
• It also helps remove redundant features, if any.
• Dimensionality reduction is an effective approach to downsizing data.
Feature Selection and Feature Reduction

• Feature selection is the process of identifying and selecting the features that are relevant to your sample.
• In practice, feature selection examines the relationship between each feature and the target variable, and drops features that contribute little to predicting the target (a univariate scoring sketch follows below).
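One way to make the feature-to-target relationship concrete is univariate scoring; SelectKBest with the ANOVA F-statistic and the iris dataset are assumptions chosen for illustration.

```python
# Sketch: score each feature against the target and keep the k best
# (scorer and dataset are illustrative assumptions).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_iris(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)

print("Per-feature scores:", selector.scores_)             # relevance to the target
print("Kept columns:", selector.get_support(indices=True))
X_reduced = selector.transform(X)                          # only the k best features remain
```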
Feature Selection

 Feature selection has the following advantages (improved mining performance):
 Speed of learning
 Predictive accuracy
 Simplicity and comprehensibility of mined results
Methods of Attribute Subset Selection
1. Step-wise forward selection:
• The procedure starts with an empty set of attributes.
• The best of the original attributes is determined and added to the set.
• At each subsequent iteration or step, the best of the remaining original attributes is added to the set (a minimal sketch follows below).
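The procedure above can be sketched as a greedy loop; the evaluator (cross-validated logistic regression) and the stopping rule are assumptions, since the slide does not fix them.

```python
# Step-wise forward selection sketch: start empty, repeatedly add the
# attribute that most improves cross-validated accuracy, stop when no
# addition helps (evaluator and stopping rule are assumptions).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

def cv_score(cols):
    return cross_val_score(LogisticRegression(max_iter=1000),
                           X[:, cols], y, cv=5).mean()

selected, remaining = [], list(range(X.shape[1]))
while remaining:
    # Score each candidate attribute when added to the current set.
    scores = {j: cv_score(selected + [j]) for j in remaining}
    best = max(scores, key=scores.get)
    if selected and scores[best] <= cv_score(selected):   # no improvement: stop
        break
    selected.append(best)
    remaining.remove(best)

print("Selected attribute indices:", selected)
```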
Methods of Attribute Subset Selection (cont'd)
2. Step-wise backward elimination:
• The procedure starts with the full set of attributes.
• At each step, it removes the worst attribute remaining in the set (sketched below).
3. Combination of forward selection and backward elimination:
• At each step, the procedure selects the best attribute and removes the worst from among the remaining attributes.

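A matching sketch of backward elimination appears below; as before, the scoring model and stopping rule are illustrative assumptions.

```python
# Step-wise backward elimination sketch: start with all attributes and
# repeatedly drop the one whose removal hurts accuracy the least,
# stopping once every removal makes things worse.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

def cv_score(cols):
    return cross_val_score(LogisticRegression(max_iter=1000),
                           X[:, cols], y, cv=5).mean()

kept = list(range(X.shape[1]))
while len(kept) > 1:
    baseline = cv_score(kept)
    # Score the set obtained by dropping each attribute in turn.
    drops = {j: cv_score([c for c in kept if c != j]) for j in kept}
    worst = max(drops, key=drops.get)
    if drops[worst] < baseline:      # every removal hurts: stop
        break
    kept.remove(worst)

print("Remaining attribute indices:", kept)
```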
Methods of Attribute Subset Selection (cont'd)
4. Decision tree induction:
• Decision tree induction constructs a flowchart-like structure where each internal (non-leaf) node denotes a test on an attribute.
• At each node, the algorithm chooses the "best" attribute to partition the data into individual classes.
• When decision tree induction is used for attribute subset selection, a tree is constructed from the given data.
• All attributes that do not appear in the tree are assumed to be irrelevant.
• The set of attributes appearing in the tree forms the reduced subset of attributes (sketched below).
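The tree-based selection the slide describes can be sketched directly: fit a tree, then keep exactly the attributes that appear in its internal nodes. The depth limit and dataset are assumptions.

```python
# Attribute subset selection via decision tree induction: attributes
# never tested by the tree are treated as irrelevant.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# tree_.feature stores the attribute tested at each node; leaves hold -2.
used = np.unique(tree.tree_.feature[tree.tree_.feature >= 0])
print("Attributes appearing in the tree:", used)
X_reduced = X[:, used]   # the reduced attribute subset
```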
Feature Reduction
• We create new features as functions of the existing ones (instead of choosing a subset of the existing features).
• This can be achieved in an unsupervised manner (e.g., principal component analysis chooses a projection that is efficient for representation).

Principal Component Analysis
• The aim is to find a new feature space with minimum loss of information.
• It is assumed that the "most important" aspects of the data lie on the projection with the greatest variance.
• Principal component analysis (PCA) transforms the data to a new coordinate system such that the greatest variance lies on the first coordinate (the first principal component), the second greatest variance lies on the second coordinate (the second principal component), and so on.
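A minimal PCA sketch, assuming the scikit-learn implementation and the iris dataset:

```python
# Project the data onto the two directions of greatest variance.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)       # columns: first and second principal components

# Fraction of total variance captured by each component, in decreasing order.
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The explained-variance ratios make the "greatest variance first" ordering on the slide directly observable.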
Thank you for your attention!

