Understanding Loss & Regularization in Deep Learning
Presented by:
Dr. Pandiyaraju V
Abishek Karthik
Sreya Mynampatti
What Are Underfitting &
Overfitting?
Models can either underfit (learn too little) or overfit (learn too
much and memorize). Both are issues in building reliable
models.
• Underfitting: Model is too simple to capture patterns.
• Overfitting: Model learns noise and performs poorly on
unseen data.
• Caused by improper architecture, too little or too much
training, or lack of regularization.
• Goal: Find the sweet spot — just right model complexity.
What is Underfitting?
Underfitting happens when the model cannot learn the underlying trend of the data.
• Model is too shallow or linear for complex data.
• High bias – makes strong assumptions, ignores important
signals.
• Training and validation loss both remain high.
• Often fixed by increasing model complexity or training time.
📌 Think: "Model didn’t even try hard enough."
Why Does Overfitting
Happen?
Overfitting makes a model memorize training data rather than
learn general patterns.
• Too many parameters (deep/wide model) on small data.
• Trained too long without checks.
• Noisy or unbalanced datasets.
• Lack of regularization techniques.
📌 Think: Memorization vs Understanding.
What is a Loss Function?
• The loss function tells us how wrong the model's prediction is.
It’s the core metric that we minimize during training.
• Measures the difference between predicted and actual
values.
• Helps update weights via backpropagation.
• Lower loss → better model performance.
📌 Example (a quick numeric sketch follows below):
• Mean Squared Error (MSE): $\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$
• Binary Cross-Entropy: $\text{BCE} = -\frac{1}{n}\sum_{i=1}^{n}\big[y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\big]$
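A minimal NumPy sketch of the two formulas above; the toy labels and predictions are made-up values for illustration only:

import numpy as np

# Toy labels and predictions (illustrative values, not from the slides)
y_true = np.array([1.0, 0.0, 1.0, 1.0])
y_pred = np.array([0.9, 0.2, 0.7, 0.4])

# Mean Squared Error: average squared difference
mse = np.mean((y_true - y_pred) ** 2)

# Binary Cross-Entropy: heavily penalizes confident wrong predictions
eps = 1e-7  # guard against log(0)
bce = -np.mean(y_true * np.log(y_pred + eps)
               + (1 - y_true) * np.log(1 - y_pred + eps))

print(f"MSE = {mse:.4f}, BCE = {bce:.4f}")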
Types of Loss Functions
Different tasks use different loss functions depending on output
type.
• MSE – For regression problems.
• MAE (Mean Absolute Error) – Less sensitive to outliers.
• Binary Cross-Entropy – For binary classification.
• Categorical Cross-Entropy – For multi-class classification.
📌 Choose loss based on the task (regression or classification).
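A hedged sketch of matching the loss to the output type; the slides do not fix a framework, so TensorFlow/Keras and the layer sizes here are assumptions:

import tensorflow as tf
from tensorflow.keras import layers

# Regression head: one linear output, MSE (or "mae" if outliers are a concern)
reg_model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1),
])
reg_model.compile(optimizer="adam", loss="mse")

# Binary classification head: sigmoid output + binary cross-entropy
bin_model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
bin_model.compile(optimizer="adam", loss="binary_crossentropy")

# Multi-class head: softmax output + categorical cross-entropy
# (use "sparse_categorical_crossentropy" when labels are integer-encoded)
multi_model = tf.keras.Sequential([
    tf.keras.Input(shape=(10,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(5, activation="softmax"),
])
multi_model.compile(optimizer="adam", loss="categorical_crossentropy")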
How Loss Drives Learning
(Backprop Recap)
Loss is used to calculate gradients and update model weights.
• Forward pass: model makes predictions.
• Compute loss between predicted and actual.
• Backward pass: gradients of loss w.r.t. weights are calculated.
• Optimizer adjusts weights to reduce loss.
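A minimal sketch of that loop with an explicit gradient tape, assuming TensorFlow/Keras and a dummy regression batch:

import tensorflow as tf

# Tiny model, loss, and optimizer for illustration
model = tf.keras.Sequential([tf.keras.Input(shape=(10,)),
                             tf.keras.layers.Dense(1)])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

x = tf.random.normal((32, 10))   # dummy inputs
y = tf.random.normal((32, 1))    # dummy targets

with tf.GradientTape() as tape:
    y_pred = model(x, training=True)          # forward pass: make predictions
    loss = loss_fn(y, y_pred)                 # compute loss vs. actual values
grads = tape.gradient(loss, model.trainable_variables)            # backward pass
optimizer.apply_gradients(zip(grads, model.trainable_variables))  # weight update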
What is Regularization?
Regularization is a technique to prevent overfitting and improve
generalization.
• Adds constraints or penalties to the model.
• Helps avoid learning too complex patterns or noise.
• Encourages simpler models that perform better on unseen
data.
📌 Key idea: add “discipline” to the learning process.
L1 and L2 Regularization
Both add penalties to the loss function but in different ways.
• L1 Regularization (Lasso): adds the absolute values of the weights, $\text{Loss}_{\text{total}} = \text{Loss} + \lambda \sum_i |w_i|$
→ Encourages sparsity (some weights become exactly zero).
• L2 Regularization (Ridge): adds the squared weights, $\text{Loss}_{\text{total}} = \text{Loss} + \lambda \sum_i w_i^2$
→ Shrinks weights smoothly, avoids large weights.
📌 Use L1 for feature selection, L2 for smooth generalization (see the sketch below).
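A minimal Keras sketch of both penalties via kernel_regularizer; the penalty strength 0.01 (the λ above) and the layer sizes are assumed starting points, not values from the slides:

from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    keras.Input(shape=(20,)),
    # L2 (Ridge): adds 0.01 * sum(w^2) to the loss, shrinking weights smoothly
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(0.01)),
    # L1 (Lasso): adds 0.01 * sum(|w|), pushing some weights to exactly zero
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(0.01)),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")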
Dropout Regularization
Dropout is a simple yet powerful technique used during training.
It randomly disables neurons during training to prevent co-adaptation.
• Forces redundancy in learning.
• Reduces risk of overfitting.
• Dropout rate = probability a neuron is turned off.
• Common in dense layers of neural networks.
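A minimal sketch of dropout between dense layers, assuming Keras; the rates 0.5 and 0.3 are common defaults, not prescriptions:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(784,)),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),   # each neuron dropped with probability 0.5 during training
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),
])
# Dropout is automatically disabled at inference time.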
Early Stopping
Sometimes, more training does more harm than good.
Early stopping halts training when performance on validation
data starts declining.
• Monitors validation loss.
• Stops training before overfitting kicks in.
• Saves compute time and avoids degrading the model.
• Often paired with checkpoints (best model saving).
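A minimal sketch of early stopping plus checkpointing, assuming Keras; the random stand-in data, patience value, and filename are illustrative only:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Small stand-in model and random data, just to make the sketch runnable
x_train = np.random.rand(500, 20).astype("float32")
y_train = np.random.randint(0, 2, size=(500, 1))
model = keras.Sequential([keras.Input(shape=(20,)),
                          layers.Dense(32, activation="relu"),
                          layers.Dense(1, activation="sigmoid")])
model.compile(optimizer="adam", loss="binary_crossentropy")

callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                  restore_best_weights=True),      # stop when val loss stalls
    keras.callbacks.ModelCheckpoint("best_model.keras", monitor="val_loss",
                                    save_best_only=True),          # keep the best model on disk
]
model.fit(x_train, y_train, validation_split=0.2,
          epochs=100, callbacks=callbacks, verbose=0)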
Batch Normalization
BatchNorm improves training speed and model stability.
It normalizes layer outputs to reduce internal covariate shift.
• Normalizes inputs across each batch.
• Speeds up convergence.
• Slight regularization effect.
• Often placed after fully connected or conv layers.
📌 Helps with vanishing/exploding gradients.
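A minimal sketch of where BatchNorm typically sits, assuming Keras; placing it between the dense layer and the activation is one common convention (after the activation also works in practice):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(32,)),
    layers.Dense(64),
    layers.BatchNormalization(),   # normalizes activations across each mini-batch
    layers.Activation("relu"),
    layers.Dense(1),
])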
Data Augmentation
Data augmentation generates more training data from existing
samples.
This helps generalize better to unseen inputs.
• Apply transformations: rotate, zoom, flip, shift, crop.
• Improves robustness to real-world variations.
• Common in computer vision tasks.
• Simulates unseen inputs without collecting new data.
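A minimal sketch of on-the-fly augmentation with Keras preprocessing layers; the transformation ranges are assumptions to tune per dataset:

from tensorflow import keras
from tensorflow.keras import layers

# These layers are active only during training and pass data through at inference
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),         # up to ±10% of a full turn
    layers.RandomZoom(0.1),
    layers.RandomTranslation(0.1, 0.1),
])

# Typically used as the first block of an image model, e.g.
# model = keras.Sequential([keras.Input(shape=(32, 32, 3)), data_augmentation, ...])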
Summary – Tackling
Overfitting
Let’s recap what we’ve learned so far about regularization
techniques.
These methods help build reliable models.
• Reduce model complexity (fewer neurons/layers).
• Add dropout in training.
• Use L1/L2 to control weights.
• Apply early stopping when val loss increases.
• Normalize inputs with BatchNorm.
• Expand data using augmentation.
📌 Combine methods for stronger generalization.
When to Use What?
There’s no one-size-fits-all — choose techniques based on your
task and data.
Here’s a rough guide:
• Small dataset → data augmentation + L2 regularization.
• Large model → dropout + L1 regularization.
• Noisy data → early stopping + robust loss (like MAE).
📌 Always watch validation metrics to avoid overfitting.
Your Takeaway
Training a deep model is not just about reducing error — it’s
about generalizing well.
A well-regularized model is both accurate and resilient.
• Don’t just memorize – learn patterns.
• Regularization is key to real-world deployment.
• Always monitor both training and validation curves.
🧠 Good models make good guesses on new data.
Code Time – Try it Yourself!
Let’s experiment with training and regularization in action!
📎 [Link]
O92O5PQLqcqmc8?usp=sharing
• Try training without regularization.
• Add L2 or dropout – observe changes in loss/accuracy.
• Use early stopping or BatchNorm and compare results.
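The linked notebook is not reproduced here, but as a rough, self-contained stand-in (MNIST as an assumed dataset, with assumed hyperparameters) this sketch trains a plain dense model and a regularized one so the validation results can be compared:

from tensorflow import keras
from tensorflow.keras import layers, regularizers

# MNIST as a stand-in dataset; flatten images for a simple dense network
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

def build_model(regularized: bool) -> keras.Model:
    # Same architecture either way; optionally add L2 + BatchNorm + dropout
    model = keras.Sequential([
        keras.Input(shape=(784,)),
        layers.Dense(256, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-3) if regularized else None),
    ])
    if regularized:
        model.add(layers.BatchNormalization())
        model.add(layers.Dropout(0.3))
    model.add(layers.Dense(10, activation="softmax"))
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

for name, flag in [("baseline", False), ("regularized", True)]:
    hist = build_model(flag).fit(x_train, y_train, validation_split=0.2,
                                 epochs=10, verbose=0)
    print(name, "final val accuracy:", hist.history["val_accuracy"][-1])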
Challenging Task
• Task 1: Image Classification with CIFAR-10 Dataset
Train two artificial neural networks (ANNs) on the CIFAR-10 dataset using mini-batch gradient descent. Apply hyperparameter tuning to both models, using different regularization techniques for each. Evaluate each model's performance with visualizations of loss and accuracy, and explain with reasoning which model performed better.
• Task 2: Predicting House Prices with the Boston Housing Dataset
Implement an artificial neural network (ANN) for regression on the Boston Housing dataset, applying mini-batch gradient descent, hyperparameter tuning, and various regularization techniques. Assess the model using Mean Squared Error and visualize training progress.
Dataset link: [Link]