
9 Types of Regression Analysis (in ML & Data Science)

Dec 22, 2020 · 6 Minute Read

Once you start exploring the world of data science, you realize there is no end
to the possibilities: there are numerous algorithms and techniques for training
a model, depending on the kind of data, its structure, and the desired model
output.

One of the most common machine learning techniques is regression analysis, a
supervised learning approach in which you train a model on labeled data to
predict continuous variables. Because there are many types of regression
algorithms, it is important to choose the right one for your data and the
problem your model solves. In this tutorial we will discuss the different types
of regression analysis in machine learning and data science, why we need
regression analysis, and how to choose the best algorithm for the data so as to
get optimum model test accuracy.

So let’s get Kraken!


What is Regression Analysis?
Regression analysis is a predictive modeling technique that evaluates the
relationship between a dependent variable (the target) and one or more
independent variables. It can be used for forecasting, time series modeling,
or finding the relationship between variables in order to predict continuous
values. For example, the relationship between a household's location and its
power bill is best studied through regression.

We can analyze data and perform data modeling using regression analysis.
Here, we fit a line or curve to the data points such that the differences
between the distances of the data points from the line or curve are
minimized.

Need for Regression techniques


Regression analysis, and the regression method of forecasting in particular,
can help a small business (and indeed any business) build a better
understanding of the variables, or factors, that may impact its success in the
coming weeks, months, and years.

Data are the essential figures that describe a business end to end. Regression
analysis helps analyze those numbers and enables firms and businesses to make
better decisions. Regression forecasting analyzes the relationships between
data points, which can help you peek into the future.

9 Types of Regression Analysis


The types of regression analysis that we are going to study here are:

1. Simple Linear Regression
2. Multiple Linear Regression
3. Polynomial Regression
4. Logistic Regression
5. Ridge Regression
6. Lasso Regression
7. Bayesian Linear Regression
8. Decision Tree Regression
9. Random Forest Regression

The last two are tree-based algorithms that we also use to train regression
models for predictions with continuous values. These techniques are mostly
driven by three prime attributes: the number of independent variables, the
type of dependent variable, and the shape of the regression line.
1) Simple Linear Regression
Linear regression is the most basic of the regression algorithms in machine
learning. The model has a single independent variable, and the dependent
variable has a linear relationship with it. When the number of independent
variables increases, the model is called a multiple linear regression model.

We denote simple linear regression by the equation given below:

y = mx + c + e

where m is the slope of the line, c is the intercept, and e represents the
error in the model.

The best-fit line is determined by varying the values of m and c over
different combinations. The difference between the observed values and the
predicted values is called the predictor error, and the values of m and c are
selected to minimize this predictor error.

Points to keep in mind:

1. A simple linear regression model is more susceptible to outliers; hence,
it should not be used in the case of large datasets.
2. There should be a linear relationship between the independent and
dependent variables.
3. There is only one independent variable and one dependent variable.
4. The type of regression line: a best-fit straight line.
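To make this concrete, here is a minimal sketch of simple linear regression using scikit-learn; the synthetic data and parameter values are illustrative assumptions, not from the original article:

```python
# A minimal sketch of simple linear regression with scikit-learn.
# The synthetic data and the true slope/intercept are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(100, 1))                # one independent variable
y = 3.0 * x.ravel() + 2.0 + rng.normal(0, 1.0, 100)  # y = mx + c + e

model = LinearRegression().fit(x, y)
print("slope m:", model.coef_[0])        # estimate of m (close to 3.0)
print("intercept c:", model.intercept_)  # estimate of c (close to 2.0)
```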

2) Multiple Linear Regression


Simple linear regression allows a data scientist or data analyst to make
predictions about one target variable by training the model on a single input
variable. A multiple linear regression model extends this to more than one
independent variable.

Simple linear regression uses the following linear function to predict the
value of a target variable y from a single independent variable x1:

y = b0 + b1x1

Multiple linear regression adds one term per additional independent variable:

y = b0 + b1x1 + b2x2 + … + bnxn

The parameters b0, …, bn that best fit the data are obtained by minimizing
the squared error when fitting the linear equation to the observed data.

Points to keep in mind:

1. Multiple regression can exhibit multicollinearity,
autocorrelation, and heteroscedasticity.
2. Multicollinearity increases the variance of the coefficient estimates and
makes the estimates very sensitive to minor changes in the model. As
a result, the coefficient estimates become unstable; a quick check for
this is sketched below.
3. In the case of multiple independent variables, we can use forward
selection, backward elimination, or a stepwise approach for feature
selection.
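As an illustration of point 2, here is a hedged sketch of one common way to check for multicollinearity, using variance inflation factors from the statsmodels library on synthetic data:

```python
# A sketch of a multicollinearity check using variance inflation factors
# (VIF) from statsmodels; the synthetic features are illustrative.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.9 * x1 + rng.normal(scale=0.1, size=200)  # nearly collinear with x1
x3 = rng.normal(size=200)
X = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# A VIF well above ~5-10 is a common rule-of-thumb sign of multicollinearity.
for i, name in enumerate(X.columns):
    print(name, variance_inflation_factor(X.values, i))
```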

3) Polynomial Regression
In a polynomial regression, the power of the independent variable is more
than 1. The equation below represents a polynomial equation:

y = a + bx²

In this regression technique, the best fit line is not a straight line. It is rather
a curve that fits into the data points.

Points to keep in mind:

1. Fitting a higher-degree polynomial to get a lower error can result in
overfitting. Plot the relationship to see the fit, and make sure the curve
matches the nature of the problem, as in the sketch below.
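Here is a minimal sketch of how polynomial regression is often fit in practice, assuming scikit-learn and synthetic quadratic data:

```python
# A minimal sketch of polynomial regression: expand the input to higher
# powers, then fit an ordinary linear model on the expanded features.
# The degree-2 signal and the degree choice are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(100, 1))
y = 1.0 + 2.0 * x.ravel() ** 2 + rng.normal(0, 1.0, 100)  # y = a + bx² + noise

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x, y)
print(model.predict([[2.0]]))  # prediction from the fitted curve (near 9.0)
```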

4) Logistic Regression
Logistic regression is a regression technique used when the dependent
variable is discrete, for example 0 or 1, or true or false. This means the
target variable can take only two values, and a sigmoid function models the
relation between the target variable and the independent variables.

The logistic function is used in logistic regression to create a relation
between the target variable and the independent variables. The equation below
denotes logistic regression:

log(p / (1 - p)) = b0 + b1x1 + … + bnxn

where p is the probability of occurrence of the event of interest.
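A minimal sketch, assuming scikit-learn and synthetic labels, of how the sigmoid-based probabilities are obtained in practice:

```python
# A minimal sketch of logistic regression for a binary (0/1) target.
# The synthetic labels are illustrative; predict_proba applies the sigmoid.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # two classes: 0 and 1

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba([[1.0, 1.0]]))  # [P(y=0), P(y=1)]
print(clf.predict([[1.0, 1.0]]))        # the class with the higher probability
```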


5) Ridge Regression
Ridge regression is another type of regression in machine learning and is
usually used when there is high correlation between the independent
variables. This is because, with multicollinearity, the least-squares
estimates remain unbiased but their variances become very large. Ridge
regression therefore deliberately introduces a small amount of bias, via the
λI term below, in exchange for a large reduction in variance. It is a
powerful regression method in which the model is less susceptible to
overfitting.
Below is the equation for the ridge estimator, where λ (lambda) addresses
the multicollinearity issue:

β = (XᵀX + λI)⁻¹ Xᵀy
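Here is a sketch that implements the closed-form equation above directly with NumPy and checks it against scikit-learn's Ridge; the data and λ value are illustrative:

```python
# A sketch of the ridge closed form above, checked against scikit-learn
# (where λ is called alpha). The data and λ value are illustrative.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 100)
lam = 1.0

# β = (XᵀX + λI)⁻¹ Xᵀy, solved without forming the inverse explicitly
beta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(beta)
print(Ridge(alpha=lam, fit_intercept=False).fit(X, y).coef_)  # should match
```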

6) Lasso Regression
Lasso regression performs regularization along with feature selection. It
penalizes the absolute size of the regression coefficients, which pulls
coefficient values toward zero and, unlike ridge regression, can set them
exactly to zero.

This is why lasso regression performs feature selection: only the required
parameters keep nonzero coefficients, while the rest are made zero, which
helps avoid overfitting in the model. However, if the independent variables
are highly collinear, lasso regression tends to choose only one of them and
reduce the others to zero. The equation below represents the lasso objective:

minimize N^{-1} Σ^{N}_{i=1} (y_{i} - ŷ_{i})² + α Σ_{j} |β_{j}|

where ŷ_{i} is the model's prediction for observation i and α controls the
strength of the penalty on the coefficients β_{j}.
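A minimal sketch of lasso's coefficient-zeroing behavior, assuming scikit-learn and synthetic data where only two of five features matter:

```python
# A minimal sketch of lasso regression: with a suitable alpha, the
# coefficients of uninformative features are driven exactly to zero.
# The data and alpha are illustrative.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0, 0.1, 100)  # two features matter

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)  # coefficients of the three irrelevant features shrink to zero
```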

7) Bayesian Linear Regression


Bayesian regression is used to find the values of the regression coefficients.
In Bayesian linear regression, the posterior distribution of the coefficients
is determined instead of a single least-squares estimate. Bayesian linear
regression combines ideas from linear regression and ridge regression (the
prior on the coefficients plays a role similar to ridge's penalty) and is
more stable than simple linear regression.
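A minimal sketch, assuming scikit-learn's BayesianRidge and synthetic data, of how a Bayesian linear model returns both a prediction and its uncertainty:

```python
# A minimal sketch of Bayesian linear regression via scikit-learn's
# BayesianRidge, which infers a posterior over the coefficients.
# The synthetic data is illustrative.
import numpy as np
from sklearn.linear_model import BayesianRidge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.5, -0.5]) + rng.normal(0, 0.1, 100)

model = BayesianRidge().fit(X, y)
mean, std = model.predict([[1.0, 0.0]], return_std=True)
print(mean, std)  # predicted mean and its posterior uncertainty
```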
Next, we turn to two tree-based types of regression analysis that can also be
used to train regression models for predictions with continuous values.

8) Decision Tree Regression


The decision tree, as the name suggests, works on the principle of conditions.
It is an efficient and powerful algorithm for predictive analysis. Its main
components are internal nodes, branches, and terminal (leaf) nodes.

Every internal node holds a "test" on an attribute, each branch holds the
outcome of the test, and every leaf node holds the prediction: a class label
for classification, or a continuous value for regression. Decision trees are
used for both classification and regression, which are both supervised
learning tasks. They are extremely sensitive to the data they are trained on:
small changes to the training set can produce fundamentally different tree
structures.
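A minimal sketch of decision tree regression, assuming scikit-learn and synthetic data:

```python
# A minimal sketch of decision tree regression. The max_depth cap is an
# illustrative way to limit sensitivity to the training data.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.1, 200)

tree = DecisionTreeRegressor(max_depth=4).fit(X, y)
print(tree.predict([[2.5]]))  # mean of the training targets in the matching leaf
```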

9) Random Forest Regression


Random forest, as its name suggests, comprises a large number of individual
decision trees that operate as a group, or ensemble. For classification,
every individual decision tree in the random forest outputs a class
prediction and the class with the most votes becomes the model's prediction;
for regression, the predictions of the individual trees are averaged.

Random forest achieves this diversity by permitting every individual tree to
randomly sample from the dataset with replacement, resulting in different
trees. This is known as bagging.
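A minimal sketch of random forest regression under the same synthetic setup as the decision tree example:

```python
# A minimal sketch of random forest regression: an ensemble of trees,
# each fit on a bootstrap sample (bagging); predictions are averaged.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(200, 1))
y = np.sin(X.ravel()) + rng.normal(0, 0.1, 200)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[2.5]]))  # average of the individual trees' predictions
```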
How to select the right regression model?
Each type of regression model performs differently and the model efficiency
depends on the data structure. Different types of algorithms help determine
which parameters are necessary for creating predictions. There are a few
methods to perform model selection.

1. Adjusted R-squared and predicted R-squared: Models with larger adjusted
and predicted R-squared values are more efficient. These statistics help you
avoid the fundamental problem with regular R-squared: it always increases
when you add an independent variable. That property encourages needlessly
complex models, which can produce misleading results. (A worked example of
adjusted R-squared appears after this list.)

o Adjusted R-squared increases only when a new parameter actually improves
the model; low-quality parameters can decrease it.
o Predicted R-squared is computed via cross-validation and can likewise
decrease when added parameters do not generalize. Cross-validation partitions
the data to determine whether the model generalizes to the rest of the
dataset.

2. P-values for the independent variables: In regression, a p-value smaller
than the significance level indicates that the term is statistically
significant. "Reducing the model" is the process of starting with all
candidate parameters in the model and then repeatedly removing the term with
the highest non-significant p-value until the model contains only
statistically significant terms.
3. Stepwise regression and best subsets regression: These are two automated
model-selection algorithms that pick which independent variables to include
in the regression equation. When you have a huge number of independent
variables and need a variable-selection process, these automated methods can
be very helpful.
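As promised above, here is a short worked sketch of the adjusted R-squared formula; the R-squared values and counts are illustrative:

```python
# A short worked sketch of adjusted R-squared, which penalizes added
# parameters: R2_adj = 1 - (1 - R2) * (n - 1) / (n - p - 1),
# where n is the number of observations and p the number of predictors.
def adjusted_r2(r2: float, n: int, p: int) -> float:
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The same R2 of 0.90 looks worse as more predictors are added:
print(adjusted_r2(0.90, n=50, p=2))   # ~0.896
print(adjusted_r2(0.90, n=50, p=20))  # ~0.831
```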

Conclusion
The different types of regression analysis in data science and machine
learning discussed in this tutorial can be used to build a model suited to
the structure of the training data in order to achieve optimum model
accuracy.

I hope the tutorial helps you get a clearer picture of the regression
algorithms and their application. Happy learning :)
