9 Types of Regression Analysis
Once you start exploring the world of data science, you realize there is no
end to the possibilities: there are numerous algorithms and techniques for
training a model, depending on the kind of data, its structure, and the
desired model output.
Regression analysis lets us analyze data and build predictive models. The
idea is to fit a line or curve to the data points such that the distances
between the data points and the fitted line or curve are minimized.
Data is the raw material that describes a business end to end. Regression
analysis helps analyze these numbers so that firms can make better
decisions. Regression forecasting means analyzing the relationships between
data points, which can help you peek into the future. The simplest case is
linear regression, which models the relationship between a dependent
variable y and an independent variable x as a straight line:
y = mx + c + e
where m is the slope of the line, c is the intercept, and e represents the
error term in the model.
The best-fit line is found by varying the values of m and c over different
combinations. The difference between an observed value and the corresponding
predicted value is called the prediction error, and the values of m and c
are chosen to minimize it.
The same model is often written with coefficients:
y = b0 + b1x1
After fitting this linear equation to the observed data, we obtain the
parameters b0 and b1 that minimize the squared error and best fit the data.
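To make this concrete, here is a minimal sketch using scikit-learn's LinearRegression; the data is synthetic, invented purely for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data for illustration: roughly y = 2x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(50, 1))
y = 2 * x.ravel() + 1 + rng.normal(0, 0.5, size=50)

# Fitting chooses b1 (the slope m) and b0 (the intercept c)
# so that the squared prediction error is minimized.
model = LinearRegression().fit(x, y)
print("slope b1:", model.coef_[0])
print("intercept b0:", model.intercept_)
```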
3) Polynomial Regression
In a polynomial regression, the power of the independent variable is more
than 1. The equation below represents a polynomial equation:
y = a + bx^2
In this regression technique, the best-fit line is not a straight line but
rather a curve that fits the data points.
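One common way to fit such a curve is to expand x into polynomial features and reuse an ordinary linear model; here is a sketch with scikit-learn, again on made-up synthetic data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data for illustration: roughly y = 3 + 0.5 * x^2 plus noise.
rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=(60, 1))
y = 3 + 0.5 * x.ravel() ** 2 + rng.normal(0, 0.3, size=60)

# Expand x into [x, x^2]; a linear model on these features fits the curve.
features = PolynomialFeatures(degree=2, include_bias=False)
x_poly = features.fit_transform(x)

model = LinearRegression().fit(x_poly, y)
print("coefficients for (x, x^2):", model.coef_)
print("intercept a:", model.intercept_)
```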
4) Logistic Regression
Logistic regression is a type of regression technique when the dependent
variable is discrete. Example: 0 or 1, true or false, etc. This means the target
variable can have only two values, and a sigmoid function shows the relation
between the target variable and the independent variable.
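A quick sketch with scikit-learn's LogisticRegression, on synthetic binary data invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic binary data: class 1 grows more likely as x grows.
rng = np.random.default_rng(2)
x = rng.uniform(-4, 4, size=(100, 1))
p = 1 / (1 + np.exp(-1.5 * x.ravel()))        # sigmoid probability
y = (rng.uniform(size=100) < p).astype(int)   # discrete target: 0 or 1

model = LogisticRegression().fit(x, y)
print("P(y=1 | x=2):", model.predict_proba([[2.0]])[0, 1])
```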
5) Ridge Regression
Ridge regression adds an L2 penalty, controlled by a parameter λ, to the
least-squares objective. This shrinks the coefficients toward zero without
setting them exactly to zero. The ridge coefficients have the closed-form
solution:
β = (X^{T}X + λI)^{-1}X^{T}y
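The closed form above translates directly into NumPy; the data below is synthetic, used only to sanity-check the formula:

```python
import numpy as np

# Synthetic data: y depends on three features plus a little noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, size=40)

lam = 1.0  # regularization strength, the λ in the formula
beta = np.linalg.inv(X.T @ X + lam * np.eye(X.shape[1])) @ X.T @ y
print("ridge coefficients:", beta)
```

In practice you would solve the linear system with np.linalg.solve rather than inverting the matrix, but the direct translation keeps the correspondence with the formula obvious.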
6) Lasso Regression
Lasso regression performs regularization along with feature selection. It
penalizes the absolute size of the regression coefficients. This drives some
coefficient values all the way to zero, a property that distinguishes it
from ridge regression.
This is why lasso regression performs feature selection: only the required
parameters keep nonzero coefficients, and the rest are set to zero, which
helps avoid overfitting in the model. However, if independent variables are
highly collinear, lasso regression tends to choose only one of them and
shrink the others to zero. The equation below represents the lasso
regression objective:
β̂ = argmin_{β} [N^{-1}Σ^{N}_{i=1}(y_{i} - x_{i}^{T}β)^{2} + λΣ_{j}|β_{j}|]
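The feature-selection behavior is easy to see in code; here is a short sketch with scikit-learn's Lasso on made-up, deliberately collinear features:

```python
import numpy as np
from sklearn.linear_model import Lasso

# Made-up data: x0 and x1 are nearly identical (highly collinear), x2 is noise.
rng = np.random.default_rng(4)
x0 = rng.normal(size=100)
X = np.column_stack([x0,
                     x0 + rng.normal(0, 0.01, size=100),
                     rng.normal(size=100)])
y = 3 * x0 + rng.normal(0, 0.1, size=100)

# Lasso tends to keep one of the collinear pair and shrink the rest to zero.
model = Lasso(alpha=0.1).fit(X, y)
print("lasso coefficients:", model.coef_)
```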
When comparing candidate regression models, two R-squared variants help
judge the quality of fit:
- Adjusted R-squared increases only when a new parameter actually improves
the model; low-quality parameters can decrease model efficiency. A small
helper for computing it follows this list.
- Predicted R-squared is based on cross-validation: the data is partitioned
to check whether the model generalizes to the rest of the dataset, so it can
reveal a drop in model accuracy on unseen data.
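As referenced above, adjusted R-squared can be computed directly from R-squared, the sample size, and the number of parameters; a minimal helper (the function name is my own):

```python
def adjusted_r2(r2, n_samples, n_params):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Penalizes parameters that add little explanatory value.
    """
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_params - 1)

# Example: the same R^2 looks worse once more parameters are involved.
print(adjusted_r2(0.90, n_samples=50, n_params=2))   # ~0.896
print(adjusted_r2(0.90, n_samples=50, n_params=10))  # ~0.874
```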
Conclusion
The different types of regression analysis in data science and machine
learning discussed in this tutorial can be used to build models suited to
the structure of your training data, in order to achieve the best model
accuracy.
I hope the tutorial helps you get a clearer picture of the regression
algorithms and their application. Happy learning :)