Linear Discriminant Analysis in R Programming

Last Updated : 10 Jul, 2020

Linear Discriminant Analysis (LDA) is one of the most popular and well-established machine learning techniques. It is a supervised method mainly used to solve classification problems, and it can also serve as a dimensionality reduction technique. Using linear combinations of the predictors, LDA tries to predict the class of a given observation. Let p be the number of predictor variables. LDA assumes that all classes have an identical variance (univariate analysis, p = 1) or identical covariance matrices (multivariate analysis, p > 1).

Method of implementing LDA in R

LDA can be computed in R using the lda() function of the MASS package. LDA estimates the group means and, for each individual, computes the probability of belonging to each group. The individual is then assigned to the group for which it has the highest probability score. To use the lda() function, install the following packages:

- MASS package for the lda() function.
- tidyverse package for easy data manipulation and visualization.
- caret package for a better machine learning workflow.

After installing these packages, prepare the data. First, split the data into a training set and a test set. Then normalize the data; categorical variables are automatically ignored in this step. Once the data is prepared, one can start with Linear Discriminant Analysis using the lda() function. The LDA algorithm first finds the directions that maximize the separation among the classes, then uses these directions to predict the class of each individual. These directions are known as linear discriminants and are linear combinations of the predictor variables.
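The statement that linear discriminants are linear combinations of the predictors can be checked directly. The following is a minimal sketch, assuming the built-in iris data and the MASS package; the names fit, centered, manual_scores and lda_scores are illustrative and are not part of the original example. It uses the scaling component of the fitted object and reproduces the centering that predict() applies (the prior-weighted average of the group means).

```r
library(MASS)

# Fit LDA on the full iris data (illustrative only; no train/test split here)
fit <- lda(Species ~ ., data = iris)

# Predictor matrix, in the same column order as the model
X <- as.matrix(iris[, 1:4])

# Center the predictors at the prior-weighted average of the group means
center <- colSums(fit$prior * fit$means)
centered <- scale(X, center = center, scale = FALSE)

# Discriminant scores: centered predictors times the LD coefficients
manual_scores <- centered %*% fit$scaling

# Scores as returned by predict() for the training data
lda_scores <- predict(fit)$x

# Compare the two sets of scores (dimnames dropped before comparing)
all.equal(unname(manual_scores), unname(lda_scores))
```

Each column of manual_scores is one linear discriminant (LD1, LD2), so every observation is mapped from four predictors to at most two discriminant scores.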
Explanation of the function lda()

Before implementing linear discriminant analysis, consider the following:

- Inspect the univariate distribution of each variable. Each should be approximately normally distributed. If not, transform it using the log or root function for exponential distributions, or the Box-Cox method for skewed distributions.
- Remove outliers from the data and then standardize the variables so that their scales are comparable.
- The dependent variable Y is assumed to be discrete.
- LDA assumes that the predictors are normally distributed, i.e. they come from a Gaussian distribution, with class-specific means and a variance or covariance matrix that is equal across classes.

The MASS package provides the lda() function for computing linear discriminant analysis. Let's look at the default ways of calling it.

Syntax:

lda(formula, data, ..., subset, na.action)

Or,

lda(x, grouping, prior = proportions, tol = 1.0e-4, method, CV = FALSE, nu, ...)

Parameters:

- formula: a formula of the form group ~ x1 + x2 + ...
- data: a data frame from which the variables in the formula are preferentially taken.
- subset: an index vector specifying the cases to be used in the training sample.
- na.action: a function specifying the action to be taken if NAs are found.
- x: a matrix or a data frame containing the predictors, required if no formula is passed.
- grouping: a factor specifying the class of each observation.
- prior: the prior probabilities of class membership. If unspecified, the class proportions of the training set are used.
- tol: a tolerance used to decide whether a matrix is singular.
- method: the estimation method: "moment" for standard estimators of the mean and variance, "mle" for maximum likelihood estimates, "mve" to use cov.mve, or "t" for robust estimates based on a t distribution.
- CV: if TRUE, returns the results (classes and posterior probabilities) for leave-one-out cross-validation.
- nu: the degrees of freedom when method = "t".
- ...: arguments passed to or from other methods.
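The two calling conventions and the prior and CV arguments described above can be illustrated as follows. This is a hedged sketch on the built-in iris data, not part of the original example; the object names (fit_formula, fit_matrix, fit_prior, fit_cv) are assumptions introduced here.

```r
library(MASS)

# Formula interface: group ~ predictors
fit_formula <- lda(Species ~ ., data = iris)

# Matrix / data frame interface: predictors in x, classes in grouping
fit_matrix <- lda(x = iris[, 1:4], grouping = iris$Species)

# Both interfaces should yield the same discriminant coefficients
all.equal(fit_formula$scaling, fit_matrix$scaling, check.attributes = FALSE)

# Explicit priors: one value per class, in the order of the factor levels,
# summing to 1
fit_prior <- lda(Species ~ ., data = iris, prior = c(1, 1, 1) / 3)

# CV = TRUE returns leave-one-out predictions instead of a fitted model
fit_cv <- lda(Species ~ ., data = iris, CV = TRUE)

# Leave-one-out predicted classes and posterior probabilities
head(fit_cv$class)
head(fit_cv$posterior)

# Leave-one-out accuracy
mean(fit_cv$class == iris$Species)
```

Note that with CV = TRUE the result is a list of cross-validated classes and posteriors, so it cannot be passed to predict(); fit the model again with CV = FALSE when predictions on new data are needed.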
The output of the lda() function contains the following elements:

- Prior probabilities of groups: the proportion of training observations in each group.
- Group means: the group's center of gravity, i.e. the mean of each variable within each group.
- Coefficients of linear discriminants: the linear combinations of the predictor variables that are used to form the decision rule of LDA.

Example:

Let us see how Linear Discriminant Analysis is computed using the lda() function. Let's use the iris data set that ships with R.

```r
# LINEAR DISCRIMINANT ANALYSIS
library(MASS)
library(tidyverse)
library(caret)
theme_set(theme_classic())

# Load the data
data("iris")

# Split the data into training (80%) and test set (20%)
set.seed(123)
training.individuals <- iris$Species %>%
  createDataPartition(p = 0.8, list = FALSE)
train.data <- iris[training.individuals, ]
test.data <- iris[-training.individuals, ]

# Estimate preprocessing parameters on the training set only
preproc.parameter <- train.data %>%
  preProcess(method = c("center", "scale"))

# Transform the data using the estimated parameters
train.transform <- preproc.parameter %>% predict(train.data)
test.transform <- preproc.parameter %>% predict(test.data)

# Fit the model
model <- lda(Species ~ ., data = train.transform)

# Make predictions
predictions <- model %>% predict(test.transform)

# Model accuracy
mean(predictions$class == test.transform$Species)

# Print the fitted model
model
```

Output:

```
[1] 1
Call:
lda(Species ~ ., data = train.transform)

Prior probabilities of groups:
    setosa versicolor  virginica
 0.3333333  0.3333333  0.3333333

Group means:
           Sepal.Length Sepal.Width Petal.Length Petal.Width
setosa       -1.0120728   0.7867793   -1.2927218  -1.2496079
versicolor    0.1174121  -0.6478157    0.2724253   0.1541511
virginica     0.8946607  -0.1389636    1.0202965   1.0954568

Coefficients of linear discriminants:
                    LD1         LD2
Sepal.Length  0.9108023  0.03183011
Sepal.Width   0.6477657  0.89852536
Petal.Length -4.0816032 -2.22724052
Petal.Width  -2.3128276  2.65441936

Proportion of trace:
   LD1    LD2
0.9905 0.0095
```

Graphical plotting of the output

Let's see what kind of plot can be drawn for a dummy data set with two classes. For this, we use the ggplot() function of the ggplot2 package. The example simulates two bivariate Gaussian classes that share one covariance matrix, which is the setting LDA assumes, and plots the samples colored by class label.

Example:

```r
# Graphical plotting of the output
library(ggplot2)
library(MASS)
library(mvtnorm)

# Variance-covariance matrix for the random bivariate Gaussian samples
var_covar <- matrix(data = c(1.5, 0.4, 0.4, 1.5), nrow = 2)

# Random bivariate Gaussian samples for class +1
Xplus1 <- rmvnorm(400, mean = c(5, 5), sigma = var_covar)

# Random bivariate Gaussian samples for class -1
Xminus1 <- rmvnorm(600, mean = c(3, 3), sigma = var_covar)

# Samples for the dependent variable
Y_samples <- c(rep(1, 400), rep(-1, 600))

# Combine the independent and dependent variables into a data frame
dataset <- as.data.frame(cbind(rbind(Xplus1, Xminus1), Y_samples))
colnames(dataset) <- c("X1", "X2", "Y")
dataset$Y <- as.character(dataset$Y)

# Plot the above samples and color by class labels
ggplot(data = dataset) +
  geom_point(aes(X1, X2, color = Y))
```

Output:

[Figure: scatter plot of X2 against X1, with points colored by class label Y]

Applications

- In face recognition systems, LDA is used to reduce the features to a smaller, more manageable number before classification.
- In customer identification systems, LDA helps to identify and choose the features that describe the characteristics of a group of customers who are likely to buy a particular product in a shopping mall.
- In medical science, LDA helps to identify the features of various diseases and classify them as mild, moderate or severe based on the patient's symptoms.

Author: shaonim8
Article Tags: R Language, R Machine-Learning