
Regularization in Machine Learning

Introduction
Regularization is a technique used in machine learning to prevent overfitting by adding a penalty to the
model’s complexity during training. It improves the generalization performance of the model on unseen
data.

Why Regularization is Needed


Overfitting occurs when a model learns noise or irrelevant details in the training data, leading to poor
performance on test data. Regularization addresses this issue by discouraging overly complex models, striking
a balance between underfitting and overfitting.

Types of Regularization
L1 Regularization (Lasso)
L1 regularization adds the sum of the absolute values of the coefficients as a penalty to the loss function.
This encourages sparsity by shrinking some weights to exactly zero, which can effectively perform feature
selection.
\text{Loss} = \text{Original Loss} + \lambda \sum_{j=1}^{n} |w_j| \qquad (1)
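As a concrete illustration, below is a minimal sketch of L1 regularization using scikit-learn's Lasso estimator. The library, the synthetic dataset, and the chosen alpha value are assumptions for illustration only; alpha plays the role of λ in Equation (1), up to scikit-learn's internal scaling of the loss.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Lasso

    # Synthetic data: 100 samples, 20 features, only 5 of which are informative.
    X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                           noise=10.0, random_state=0)

    # alpha corresponds to lambda; a larger alpha drives more weights to zero.
    lasso = Lasso(alpha=1.0).fit(X, y)

    # Many coefficients come out exactly zero: L1 performs feature selection.
    print("non-zero coefficients:", np.sum(lasso.coef_ != 0), "out of", X.shape[1])

Inspecting lasso.coef_ directly shows that most entries are exactly zero, which is the sparsity property discussed next.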

Properties of L1 Regularization:
• Promotes sparsity in the model parameters, leading to a sparse solution where some weights are exactly
zero.
• Performs feature selection, as irrelevant features will have zero coefficients.
• Useful in high-dimensional datasets with many irrelevant or redundant features.
• The optimization problem is not differentiable at zero, making it slightly harder to optimize compared
to L2 regularization.

Pros and Cons of L1 Regularization:
• Pros:
– Simpler and more interpretable models due to sparsity.
– Automatic feature selection.
• Cons:
– May lead to instability when features are highly correlated.
– Does not handle multicollinearity well.

L2 Regularization (Ridge)
L2 regularization adds the sum of the squared values of the coefficients as a penalty to the loss function. It
encourages smaller coefficients by penalizing large weights, without driving them to exactly zero.
\text{Loss} = \text{Original Loss} + \lambda \sum_{j=1}^{n} w_j^2 \qquad (2)
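For comparison, here is a minimal sketch of L2 regularization with scikit-learn's Ridge estimator; again, the data and the alpha value are illustrative assumptions, with alpha playing the role of λ in Equation (2).

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression, Ridge

    X, y = make_regression(n_samples=100, n_features=20, noise=10.0, random_state=0)

    ols = LinearRegression().fit(X, y)     # no regularization
    ridge = Ridge(alpha=10.0).fit(X, y)    # L2 penalty with lambda = 10

    # Weights shrink toward zero relative to OLS, but none are eliminated.
    print("largest |w|, OLS  :", np.abs(ols.coef_).max())
    print("largest |w|, Ridge:", np.abs(ridge.coef_).max())
    print("weights exactly zero:", np.sum(ridge.coef_ == 0))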

Properties of L2 Regularization:
• Shrinks weights towards zero but does not eliminate them (no sparsity).
• Reduces model variance, leading to more stable predictions.
• Performs well when features are correlated, as it distributes weights evenly.
• The optimization problem is smooth and differentiable, making it easier to solve.

Pros and Cons of L2 Regularization:
• Pros:
– Handles multicollinearity well.
– Leads to more stable models.
• Cons:
– Does not perform feature selection, as weights are never exactly zero.
– Can result in complex models in high-dimensional settings.

Elastic Net Regularization


Elastic Net combines L1 and L2 regularization, balancing sparsity and small coefficients. It is particularly
useful when there are many correlated features.
\text{Loss} = \text{Original Loss} + \lambda_1 \sum_{j=1}^{n} |w_j| + \lambda_2 \sum_{j=1}^{n} w_j^2 \qquad (3)
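A minimal sketch with scikit-learn's ElasticNet follows; note that its alpha and l1_ratio parameters jointly determine the two penalty strengths λ1 and λ2 in Equation (3) (up to scikit-learn's internal scaling), and the values below are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import ElasticNet

    X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                           noise=10.0, random_state=0)

    # l1_ratio=0.5 gives equal weight to the L1 and L2 penalty terms.
    enet = ElasticNet(alpha=1.0, l1_ratio=0.5).fit(X, y)

    print("non-zero coefficients:", np.sum(enet.coef_ != 0), "out of", X.shape[1])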

Comparison of L1 and L2 Regularization


The following table highlights the key differences between L1 (Lasso) and L2 (Ridge) regularization:

Property                     L1 (Lasso)                                 L2 (Ridge)
Weight Penalty               λ Σ |w_j|                                  λ Σ w_j²
Weight Shrinkage             Some weights are exactly zero              All weights are small but non-zero
Feature Selection            Yes (sparse model)                         No
Handling Multicollinearity   Poor                                       Good
Optimization Complexity      Non-differentiable at zero                 Smooth and differentiable
Use Case                     High-dimensional data, sparse features     Low-dimensional data, correlated features

Table 1: Comparison of L1 and L2 Regularization

Tuning Regularization
The strength of regularization is controlled by a hyperparameter λ, which determines the trade-off between
model complexity and fit to the data. Higher λ leads to stronger regularization (simpler models), while lower
λ allows for more complex models.
Hyperparameter tuning can be done using:
• Cross-validation to select the optimal λ.

• Grid search or randomized search to explore possible values.
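Combining both ideas, the following sketch uses scikit-learn's GridSearchCV to select λ (alpha) for Ridge by 5-fold cross-validation; the candidate grid and dataset are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import GridSearchCV

    X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

    # Candidate lambda values spanning several orders of magnitude.
    param_grid = {"alpha": np.logspace(-3, 3, 13)}

    search = GridSearchCV(Ridge(), param_grid, cv=5)  # 5-fold cross-validation
    search.fit(X, y)

    print("best lambda (alpha):", search.best_params_["alpha"])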

Conclusion
Regularization is essential for building machine learning models that generalize well. Ridge (L2) and Lasso
(L1) regularization are widely used techniques:
• L1 Regularization (Lasso): Promotes sparsity, useful for feature selection.
• L2 Regularization (Ridge): Shrinks weights evenly, handles multicollinearity.
Elastic Net offers a compromise between L1 and L2 regularization, making it useful for datasets with
many correlated features.
