2. PCA

Dimension Reduction is a process in pattern recognition that simplifies high-dimensional data into lower dimensions while retaining essential information. Principal Component Analysis (PCA) is a key technique for this, transforming original variables into orthogonal principal components to capture variance. While PCA offers advantages like improved performance and visualization, it also has limitations such as potential information loss and reduced interpretability of independent variables.


Data Analysis

Dr. C Santhosh Kumar


Dimension Reduction:

In pattern recognition, Dimension Reduction is defined as-

• It is the process of converting a data set having vast dimensions into a data set with fewer dimensions.

• It ensures that the converted data set conveys similar information concisely.
Example:

Consider the following example:

• Suppose a data set has two dimensions, x1 and x2.

• x1 represents the measurement of several objects in cm.

• x2 represents the measurement of several objects in inches.


In machine learning,

• Using both of these dimensions conveys the same information, since cm and inches measure the same quantity.

• Also, keeping both introduces noise into the system.

• So, it is better to use just one dimension.

Using dimension reduction techniques-

• We convert the data from 2 dimensions (x1 and x2) to 1 dimension (z1).

• This makes the data relatively easier to explain, as sketched below.
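
A minimal sketch of this idea in Python/NumPy, assuming some hypothetical cm measurements (the array values below are illustrative, not from the slides):

import numpy as np

# Hypothetical measurements of the same objects: x1 in cm, x2 in inches
x1_cm = np.array([10.0, 20.0, 30.0, 40.0])
x2_in = x1_cm / 2.54
X = np.column_stack([x1_cm, x2_in])          # 2 dimensions (x1, x2)

# Project onto the direction of maximum variance to get 1 dimension (z1)
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
z1 = Xc @ eigvecs[:, -1]                     # the 1-dimensional representation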


Dimension Reduction Techniques:
Principal Component Analysis (PCA):

• Principal Component Analysis is a well-known dimension reduction technique.

• It transforms the variables into a new set of variables called principal components.

• These principal components are linear combinations of the original variables and are mutually orthogonal.

• The first principal component accounts for the largest possible variation in the original data.

• The second principal component captures as much of the remaining variance as possible.

• There can be only two principal components for a two-dimensional data set.
PCA Algorithm:
The steps involved in the PCA Algorithm are as follows (a short code sketch follows the list)-

• Step-01: Get data.

• Step-02: Compute the mean vector (µ).

• Step-03: Subtract the mean from the given data.

• Step-04: Calculate the covariance matrix.

• Step-05: Calculate the eigen vectors and eigen values of the covariance matrix.

• Step-06: Choose components and form a feature vector.

• Step-07: Derive the new data set.
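
A minimal NumPy sketch of these seven steps (the function name pca and its arguments are illustrative, not part of the slides):

import numpy as np

def pca(X, n_components):
    # Step-01: data as an (n_samples, n_features) array
    X = np.asarray(X, dtype=float)
    # Step-02: mean vector
    mu = X.mean(axis=0)
    # Step-03: subtract the mean
    Xc = X - mu
    # Step-04: covariance matrix (dividing by N, as in the worked example below)
    C = (Xc.T @ Xc) / X.shape[0]
    # Step-05: eigen values and eigen vectors of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)       # returned in ascending order
    # Step-06: choose the top components to form the feature vector
    order = np.argsort(eigvals)[::-1][:n_components]
    W = eigvecs[:, order]
    # Step-07: derive the new data set
    return Xc @ W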


PCA EXAMPLE PROBLEM:

1. Compute the principal component of the following data-

• CLASS 1

• X=2,3,4

• Y=1,5,3

• CLASS 2

• X=5,6,7

• Y=6,7,8
Step-01:
Get data.

The given feature vectors are-

• x1 = (2, 1)

• x2 = (3, 5)

• x3 = (4, 3)

• x4 = (5, 6)

• x5 = (6, 7)

• x6 = (7, 8)
Step-02:

Calculate the mean vector (µ).

Mean vector (µ) = ((2 + 3 + 4 + 5 + 6 + 7) / 6, (1 + 5 + 3 + 6 + 7 + 8) / 6)

= (4.5, 5)
Step-03:

Subtract mean vector (µ) from the given feature vectors.

• x1 – µ = (2 – 4.5, 1 – 5) = (-2.5, -4)

• x2 – µ = (3 – 4.5, 5 – 5) = (-1.5, 0)

• x3 – µ = (4 – 4.5, 3 – 5) = (-0.5, -2)

• x4 – µ = (5 – 4.5, 6 – 5) = (0.5, 1)

• x5 – µ = (6 – 4.5, 7 – 5) = (1.5, 2)

• x6 – µ = (7 – 4.5, 8 – 5) = (2.5, 3)
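
Steps 02 and 03 can be verified with a few lines of NumPy (a minimal sketch using the six feature vectors above):

import numpy as np

X = np.array([[2, 1], [3, 5], [4, 3], [5, 6], [6, 7], [7, 8]], dtype=float)
mu = X.mean(axis=0)    # mean vector: [4.5, 5.0]
Xc = X - mu            # mean-subtracted vectors, e.g. Xc[0] = [-2.5, -4.0]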
Step-04:

Calculate the covariance matrix.

The covariance matrix is given by

Covariance matrix = (m1 + m2 + m3 + m4 + m5 + m6) / 6

where each mi = (xi – µ)(xi – µ)ᵀ is the outer product of a mean-subtracted feature vector with itself.

On adding the above matrices and dividing by 6, we get:

Covariance matrix =  [ 2.92   3.67 ]
                     [ 3.67   5.67 ]

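The same matrix can be obtained in one line (a sketch, continuing from the centered array Xc above):

C = (Xc.T @ Xc) / len(X)    # divide by N = 6
# C is approximately [[2.92, 3.67], [3.67, 5.67]]
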
Step-05:

Calculate the eigen values and eigen vectors of the covariance matrix.

λ is an eigen value of a matrix M if it is a solution of the characteristic equation |M – λI| = 0.

From here,

(2.92 – λ)(5.67 – λ) – (3.67 × 3.67) = 0

16.56 – 2.92λ – 5.67λ + λ² – 13.47 = 0

λ² – 8.59λ + 3.09 = 0

Solving this quadratic equation, we get λ = 8.22, 0.38. Thus, the two eigen values are λ1 = 8.22 and λ2 = 0.38.

Clearly, the second eigen value is very small compared to the first eigen value, so the second eigen vector can be left out.

The eigen vector corresponding to the greatest eigen value is the principal component of the given data set, so we find the eigen vector corresponding to eigen value λ1.

We use the following equation to find the eigen vector-

MX = λX

where-

• M = Covariance Matrix

• X = Eigen vector

• λ = Eigen value
Substituting M and λ1 into this equation, we get-
2.92X1 + 3.67X2 = 8.22X1
3.67X1 + 5.67X2 = 8.22X2

On simplification, we get-
5.3X1 = 3.67X2 ………(1)
3.67X1 = 2.55X2 ………(2)

From (1) and (2), X1 = 0.69X2.


From (2), taking X2 = 3.67 gives X1 = 2.55, so the eigen vector corresponding to λ1 is

X = (2.55, 3.67)
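
The hand-computed eigen values and eigen vector can be cross-checked numerically (a sketch, continuing from the covariance matrix C defined in the earlier snippet):

eigvals, eigvecs = np.linalg.eigh(C)   # ascending order: approximately [0.38, 8.21]
v1 = eigvecs[:, -1]                    # eigen vector for the largest eigen value
# v1 is approximately [0.57, 0.82], i.e. proportional to (2.55, 3.67) and to (0.69, 1)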
Step-06:

Choose components and form the feature vector. Only the eigen vector corresponding to the largest eigen value λ1 = 8.22 is retained, so the feature vector is (2.55, 3.67).

Step-07:

Derive the new data set by projecting each mean-subtracted feature vector onto the chosen eigen vector. The resulting one-dimensional data set is:

Point       P11     P12    P13    P14    P15    P16
Projection  -20.9   -3.8   -8.5   4.8    11.1   17.3
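
These projections can be reproduced by multiplying the mean-subtracted data by the eigen vector (a sketch, continuing from Xc above; small differences are due to rounding of the hand-computed eigen vector):

P = Xc @ np.array([2.55, 3.67])
# P is approximately [-21.1, -3.8, -8.6, 4.9, 11.2, 17.4]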


ADVANTAGES:

• Removes Correlated Features.

• Improves Algorithm Performance.

• Reduces Overfitting.

• Improves Visualization.
LIMITATIONS OF PCA:
• The principal components are linear combinations of the original features, so the independent variables become less interpretable.

• Data standardization is a must before applying PCA.

• Dropping components leads to some information loss.
