Principal Component Analysis

Principal Component Analysis (PCA)

PCA is a statistical technique that reduces the number of dimensions in a data set as far as
possible while retaining most of the information (variance) in the data.

Multidimensional data can be visualized in 2D or 3D by projecting it onto its first two or three
principal components. PCA is also used within factor analysis, where it is known as the
principal component method.

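Step-by-step explanation of PCA:

- STANDARDIZATION: rescale each variable to zero mean and unit variance so that all variables
contribute equally.
- COVARIANCE MATRIX COMPUTATION: measure how the standardized variables vary together.
- FEATURE VECTOR: compute the eigenvectors and eigenvalues of the covariance matrix and keep the
eigenvectors with the largest eigenvalues as the columns of the feature vector.
- RECAST THE DATA ALONG THE PRINCIPAL COMPONENTS AXES: project the standardized data onto the
feature vector.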
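
The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pca`, the synthetic data, and the choice of two components are assumptions made for the example, and the sketch assumes no feature is constant (a constant feature would cause a division by zero during standardization).

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its leading principal components."""
    # Step 1, standardization: zero mean and unit variance for every feature.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2, covariance matrix of the standardized features.
    C = np.cov(Z, rowvar=False)
    # Eigendecomposition; eigh is meant for symmetric matrices and
    # returns eigenvalues in ascending order, so re-sort to descending.
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Step 3, feature vector: the top eigenvectors as columns.
    W = eigvecs[:, :n_components]
    # Step 4, recast the data along the principal component axes.
    return Z @ W, eigvals

# Synthetic data (an assumption for the example): the third feature
# is almost a copy of the first, so two components capture most variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)

scores, eigvals = pca(X, n_components=2)
print(scores.shape)             # one row per sample, one column per component
print(eigvals / eigvals.sum())  # share of variance explained by each component
```

The ratio `eigvals / eigvals.sum()` is a common way to decide how many components to keep.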

Applications of PCA:
- In machine learning, PCA is used to visualize multidimensional data.
- In medicine, PCA is used to explore data for the factors that contribute most to the risk of a
chronic disease.
- In image processing, PCA is used to compress images by keeping only the leading components.

Disadvantages of PCA:
Principal components can be difficult to interpret, because each one is a linear combination of
all the original features. Even after calculating the principal components, it may be hard to
tell which original features matter most. For data sets with very many features, computing the
covariance matrix and its eigendecomposition can also be expensive.

PCA versus factor analysis (FA):

- PCA and factor analysis (FA) are both dimensionality reduction techniques, but they serve
different purposes.
- PCA aims to capture as much variance as possible in a few components and is used for feature
extraction, while FA aims to explain the observed variables in terms of latent factors.

Covariance matrix:
- Measures the interdependence between pairs of features. Identifying strongly correlated
features makes it possible to remove redundancy, which can improve model performance.

PCA explains the variance-covariance structure of a data set through a small number of linear
combinations of the original variables. You can use PCA to analyze how the observations (rows)
are dispersed and to identify properties of their distribution.

When to use PCA?
- Whenever we need features that are uncorrelated with each other.
- Whenever we need a smaller set of features derived from a larger one.

Variance - measures the spread of the data along each dimension (axis) of the feature space.

Covariance - measures how two features vary together, that is, the linear relationship between
them.

Standardizing data - rescaling each feature to a common scale (typically zero mean and unit
variance) so that features with large numeric ranges do not dominate the result.
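
As a small illustration of these three definitions, the sketch below computes them with NumPy. The height and weight values are hypothetical toy data invented for the example, and the sample (n - 1) normalization is an assumption; other conventions divide by n.

```python
import numpy as np

# Hypothetical toy data: height (cm) and weight (kg) for five people.
X = np.array([[150.0, 50.0],
              [160.0, 58.0],
              [170.0, 65.0],
              [180.0, 80.0],
              [190.0, 90.0]])

# Variance: spread of each feature along its own axis.
variances = X.var(axis=0, ddof=1)

# Covariance matrix: variances on the diagonal,
# covariances between feature pairs off the diagonal.
cov_matrix = np.cov(X, rowvar=False)

# Standardization: zero mean and unit variance per feature.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# The covariance matrix of standardized data is the correlation matrix.
cov_of_standardized = np.cov(Z, rowvar=False)

print(variances)
print(cov_matrix)
print(cov_of_standardized)
```

After standardization both features have variance 1, so neither dominates simply because it is measured in larger units.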

Eigenvalues and eigenvectors:

- The eigenvectors of the covariance matrix give the directions of maximum variance in the data
set; these directions are the principal components. Each eigenvalue gives the amount of variance
along its eigenvector.

Variance alone can only describe the spread of the data in directions parallel to the axes of
the feature space; covariance is needed to describe spread along diagonal directions.

The eigenvector of the covariance matrix with the largest eigenvalue always points in the
direction of the largest data variance, and that eigenvalue equals the variance along this
direction. The eigenvector with the second largest eigenvalue is always orthogonal to the first
and points in the direction of the second largest spread of the data.
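
These properties can be checked numerically. The sketch below is only an illustration: the mixing matrix `A` and the synthetic 2D data are assumptions chosen for the example. It compares the variance along the leading eigenvector with the variance along a grid of other unit directions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic correlated 2D data (the mixing matrix A is an assumption).
A = np.array([[2.0, 0.0],
              [1.2, 0.5]])
X = rng.normal(size=(500, 2)) @ A.T
Xc = X - X.mean(axis=0)

C = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(C)     # eigenvalues in ascending order
v1, v2 = eigvecs[:, 1], eigvecs[:, 0]    # leading and second eigenvector
lam1, lam2 = eigvals[1], eigvals[0]

# Variance of the data projected onto the leading eigenvector.
var_along_v1 = (Xc @ v1).var(ddof=1)

# Variance along unit vectors at angles from 0 to 180 degrees.
angles = np.linspace(0.0, np.pi, 181)
dirs = np.column_stack([np.cos(angles), np.sin(angles)])
var_along_dirs = (Xc @ dirs.T).var(axis=0, ddof=1)

print(np.dot(v1, v2))              # close to zero: the eigenvectors are orthogonal
print(var_along_v1, lam1)          # variance along v1 matches the largest eigenvalue
print(var_along_dirs.max(), lam1)  # no sampled direction exceeds the largest eigenvalue
```

Note that `np.linalg.eigh` returns unit-length eigenvectors, so the eigenvalue, not the vector's length, carries the variance.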

The observed data can be viewed as a linear transformation of white (uncorrelated,
unit-variance) data, and the covariance matrix of the observed data is determined by that
transformation. The transformation is entirely defined by the eigenvectors (rotation) and
eigenvalues (scaling) of the covariance matrix.
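
A minimal sketch of this relationship, under stated assumptions: the covariance matrix `Sigma` below is a hypothetical example, `R` holds its eigenvectors and `S` the square roots of its eigenvalues. The transformation `T = R @ S` turns white data into data with covariance `Sigma`, and reversing it whitens the data again.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical target covariance matrix (symmetric, positive definite).
Sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])

# R: eigenvectors (rotation); S: square roots of eigenvalues (scaling).
eigvals, R = np.linalg.eigh(Sigma)
S = np.diag(np.sqrt(eigvals))
T = R @ S                      # linear transformation, so that T @ T.T == Sigma

# Apply T to white data: the result has covariance close to Sigma.
white = rng.normal(size=(5000, 2))
observed = white @ T.T
sample_cov = np.cov(observed, rowvar=False)

# Reverse direction: whiten the observed data using the eigenvectors
# and eigenvalues of its own sample covariance matrix.
vals, vecs = np.linalg.eigh(sample_cov)
whitened = (observed - observed.mean(axis=0)) @ vecs / np.sqrt(vals)

print(T @ T.T)                         # reproduces Sigma
print(sample_cov)                      # approximately Sigma
print(np.cov(whitened, rowvar=False))  # identity matrix up to rounding
```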

Principal component analysis is not only used for simple dimensionality reduction but can also
be used to identify key features and solve multicollinearity problems.
