Principal Component Analysis:
Principal Component Analysis (PCA) is a dimensionality reduction technique commonly used in
data analysis and machine learning. Its primary goal is to reduce the dimensionality of a dataset
while preserving as much of the variance, i.e. the information present in the data, as possible.
PCA achieves this by transforming the original variables into a new set of variables, called
principal components. These principal components are linear combinations of the original
variables and are orthogonal to each other, meaning they are uncorrelated. The first principal
component accounts for the largest possible variance in the data; the second accounts for the
largest remaining variance while being orthogonal to the first, and so on.
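As a quick illustration of this ordering (a minimal sketch assuming NumPy and scikit-learn are available; the data below is arbitrary), a fitted PCA model reports the explained variance of its components in decreasing order, and the components are mutually orthogonal:

import numpy as np
from sklearn.decomposition import PCA

# Arbitrary illustrative data: 200 samples, 3 features, two of them correlated
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
X = np.hstack([x, 0.5 * x + 0.1 * rng.normal(size=(200, 1)), rng.normal(size=(200, 1))])

pca = PCA(n_components=3).fit(X)
print(pca.explained_variance_ratio_)                      # decreasing: PC1 first, then PC2, PC3
print(np.round(pca.components_ @ pca.components_.T, 6))   # ~identity matrix: components are orthogonal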
In essence, PCA helps in simplifying the complexity of high-dimensional data by capturing the
most important patterns or directions of variation in the data, thereby enabling easier
visualization, exploration, and analysis of the dataset. It is widely used in various fields such as
image processing, signal processing, finance, and bioinformatics, among others.
The principal components (PCs) in PCA are derived through linear algebra techniques, primarily
eigenvalue decomposition of the covariance matrix of the data or, equivalently, singular value
decomposition (SVD) of the centered data matrix. Here's a brief overview of the mathematics
behind PCA:
1. Centering the data: First, the mean of each feature (variable) is subtracted from the dataset.
This step ensures that the data is centered around the origin.
2. Covariance matrix: The covariance matrix is calculated for the centered data. This matrix
represents the pairwise covariances between all pairs of features.
3. Eigenvalue decomposition (EVD): The covariance matrix is decomposed into its eigenvectors
and eigenvalues. The eigenvectors represent the directions (principal components) of maximum
variance in the data, and the corresponding eigenvalues represent the magnitude of variance along
those directions. The eigenvectors are usually sorted in descending order of their eigenvalues, so
the first principal component (PC1) captures the most variance, the second principal component
(PC2) captures the second most, and so on.
4. Selecting principal components: After obtaining the eigenvectors or singular vectors, the
desired number of principal components is selected based on the explained variance or the
application's requirements. Typically, one can select a subset of the principal components that
capture most of the variance in the data.
5. Projection: Finally, the original data is projected onto the selected principal components to
obtain the reduced-dimensional representation of the data. This is achieved by taking the dot
product of the centered data matrix with the matrix of selected principal components.
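Putting the five steps together, here is a minimal NumPy sketch (an illustration under the assumption that X is a samples-by-features array and k is the number of components to keep, not a definitive implementation):

import numpy as np

def pca(X, k):
    # 1. Center the data: subtract the mean of each feature
    X_centered = X - X.mean(axis=0)
    # 2. Covariance matrix of the centered data (features x features)
    cov = np.cov(X_centered, rowvar=False)
    # 3. Eigenvalue decomposition; eigh is used because the covariance matrix is symmetric
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # sort directions by variance, descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # 4. Select the top k principal components (columns of eigvecs)
    W = eigvecs[:, :k]
    # 5. Project the centered data onto the selected components
    return X_centered @ W, W, eigvals

# Example usage on arbitrary data
X = np.random.default_rng(1).normal(size=(100, 5))
X_reduced, W, eigvals = pca(X, k=2)
print(X_reduced.shape)    # (100, 2)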
Numerical example:
To compute PCA on a concrete dataset, apply the steps above in order: center the data, compute the
covariance matrix, find its eigenvalues and eigenvectors, select the leading principal components,
and project the centered data onto them.
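As a small illustration (the numbers here are arbitrary, chosen only so the arithmetic stays simple), consider four samples with two features: (2, 1), (3, 2), (4, 3), (5, 4).
1. Center the data: the feature means are 3.5 and 2.5, so the centered samples are (-1.5, -1.5), (-0.5, -0.5), (0.5, 0.5), (1.5, 1.5).
2. Covariance matrix: every entry equals (2.25 + 0.25 + 0.25 + 2.25) / 3 = 5/3, giving C = [[5/3, 5/3], [5/3, 5/3]].
3. Eigenvalue decomposition: the eigenvalues of C are 10/3 and 0. The eigenvector for 10/3 is (1/√2, 1/√2), so PC1 points along the direction (1, 1) and captures all of the variance.
4. Selecting principal components: keeping only PC1 reduces the data from two dimensions to one.
5. Projection: the dot product of each centered sample with (1/√2, 1/√2) gives the one-dimensional scores -3/√2, -1/√2, 1/√2, 3/√2 (about -2.12, -0.71, 0.71, 2.12); their variance is 10/3, which matches the leading eigenvalue.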
Linear Discriminant Analysis:
Linear Discriminant Analysis (LDA) is a dimensionality reduction technique and a classification
algorithm used in machine learning and statistics. It is primarily used in supervised learning
tasks, since it relies on class labels.
The main goal of LDA is to find a linear combination of features that characterizes or separates
two or more classes of objects or events. It's particularly useful when dealing with classification
problems where the classes are well-separated. LDA seeks to project the feature space onto a
lower-dimensional space while preserving the class discriminatory information as much as
possible.
Here's how LDA works:
1. Calculate the mean vectors: For each class in the dataset, calculate the mean vector, which
represents the mean values of each feature for that class.
2. Compute the scatter matrices: There are two scatter matrices used in LDA:
Within-class scatter matrix (Sw): It measures the spread of the data around their own class means, i.e. the scatter within individual classes.
Between-class scatter matrix (Sb): It measures the spread of the class means around the overall mean, i.e. the separation between classes.
3. Compute the eigenvectors and eigenvalues: Next, compute the eigenvectors and eigenvalues of
the matrix (Sw^(-1)) * Sb. The eigenvectors represent the directions (linear discriminants) that
maximize the separation between classes, while the eigenvalues indicate how much class-discriminatory
information each direction carries (the ratio of between-class to within-class scatter along it).
4. Select discriminants: Sort the eigenvectors by their corresponding eigenvalues in descending
order and choose the top k eigenvectors to form a matrix W. These eigenvectors serve as the axes
for the new feature subspace.
5. Project the data onto the new feature subspace: Multiply the original data matrix by the
matrix W to obtain the representation of the data in the new, lower-dimensional feature subspace.
6. Classification: Once the data is projected onto the new feature subspace, a classification
algorithm (e.g., nearest neighbor classifier, logistic regression) can be applied to classify the
data.
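The steps above can be sketched with NumPy as follows (a minimal illustration, assuming X is a samples-by-features array, y is a vector of class labels, and k is the number of discriminants to keep; it is not meant as a production implementation):

import numpy as np

def lda_projection(X, y, k):
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)
    # 1. Mean vector of each class
    class_means = {c: X[y == c].mean(axis=0) for c in classes}
    # 2. Within-class (Sw) and between-class (Sb) scatter matrices
    Sw = np.zeros((n_features, n_features))
    Sb = np.zeros((n_features, n_features))
    for c in classes:
        Xc = X[y == c]
        centered = Xc - class_means[c]
        Sw += centered.T @ centered
        mean_diff = (class_means[c] - overall_mean).reshape(-1, 1)
        Sb += Xc.shape[0] * (mean_diff @ mean_diff.T)
    # 3. Eigenvectors and eigenvalues of Sw^(-1) * Sb
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    eigvals, eigvecs = eigvals.real, eigvecs.real
    # 4. Sort by eigenvalue (descending) and keep the top k linear discriminants as W
    order = np.argsort(eigvals)[::-1]
    W = eigvecs[:, order[:k]]
    # 5. Project the data onto the new feature subspace
    return X @ W, W

# Example usage on arbitrary labelled data (two classes, so at most one discriminant)
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0, 1, size=(50, 4)), rng.normal(2, 1, size=(50, 4))])
y = np.array([0] * 50 + [1] * 50)
X_reduced, W = lda_projection(X, y, k=1)
print(X_reduced.shape)    # (100, 1)

After the projection, any classifier can be trained on the reduced data (step 6). In practice, library implementations such as scikit-learn's LinearDiscriminantAnalysis perform the projection and the classification together.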
LDA is widely used in various fields, including pattern recognition, face recognition,
bioinformatics, and finance, among others. It's especially effective when the classes are well-
separated and the assumptions of normality and equal covariance matrices hold true.