Chapter 1

The document outlines an intermediate course on deep learning with PyTorch, focusing on training robust models using optimizers, addressing vanishing and exploding gradients, and implementing CNNs and RNNs. It covers prerequisites such as neural network training and PyTorch basics, and introduces object-oriented programming concepts to define datasets and models. Additionally, it discusses various optimizers, model evaluation techniques, and solutions for unstable gradients, including weight initialization and batch normalization.


PyTorch and object-oriented programming
INTERMEDIATE DEEP LEARNING WITH PYTORCH

Michal Oleszak
Machine Learning Engineer
What we will learn
How to train robust deep learning models:

Improving training with optimizers

Mitigating vanishing and exploding gradients

Convolutional Neural Networks (CNNs)

Recurrent Neural Networks (RNNs)

Multi-input and multi-output models


Prerequisites
The course assumes you are comfortable with the following topics:

Neural network training:
  Forward pass
  Loss calculation
  Backward pass (backpropagation)

Training models with PyTorch:
  Datasets and DataLoaders
  Model training loop
  Model evaluation

Prerequisite course: Introduction to Deep Learning with PyTorch


Object-Oriented Programming (OOP)
We will use OOP to define:

PyTorch Datasets

PyTorch Models

In OOP, we create objects with:

Abilities (methods)

Data (attributes)


Object-Oriented Programming (OOP)
class BankAccount:
    def __init__(self, balance):
        self.balance = balance

__init__ is called when a BankAccount object is created

balance is an attribute of the BankAccount object

account = BankAccount(100)
print(account.balance)

100


Object-Oriented Programming (OOP)
Methods: Python functions that perform tasks

The deposit method increases the balance

class BankAccount:
    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

account = BankAccount(100)
account.deposit(50)
print(account.balance)

150


Water potability dataset

[Figure slide: preview of the water potability dataset — image not included]


PyTorch Dataset
__init__: load the data and store it as a NumPy array; super().__init__() ensures WaterDataset behaves like a torch Dataset

__len__: return the size of the dataset

__getitem__: take one argument called idx and return the features and label for a single sample at index idx

import pandas as pd
from torch.utils.data import Dataset

class WaterDataset(Dataset):
    def __init__(self, csv_path):
        super().__init__()
        df = pd.read_csv(csv_path)
        self.data = df.to_numpy()

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        features = self.data[idx, :-1]
        label = self.data[idx, -1]
        return features, label


PyTorch DataLoader
from torch.utils.data import DataLoader

dataset_train = WaterDataset(
    "water_train.csv"
)

dataloader_train = DataLoader(
    dataset_train,
    batch_size=2,
    shuffle=True,
)

features, labels = next(iter(dataloader_train))
print(f"Features: {features},\nLabels: {labels}")

Features: tensor([
    [0.4899, 0.4180, 0.6299, 0.3496, 0.4575,
     0.3615, 0.3259, 0.5011, 0.7545],
    [0.7953, 0.6305, 0.4480, 0.6549, 0.7813,
     0.6566, 0.6340, 0.5493, 0.5789]
]),
Labels: tensor([1., 0.])


PyTorch Model
Sequential model definition:

net = nn.Sequential(
    nn.Linear(9, 16),
    nn.ReLU(),
    nn.Linear(16, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
    nn.Sigmoid(),
)

Class-based model definition:

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x

net = Net()
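A minimal sketch tying the pieces together (assuming the dataloader_train and Net defined above): a single forward pass on one batch. Note that the Dataset returns NumPy float64 values, so the batch is cast to float32 before it enters the model.

features, labels = next(iter(dataloader_train))
features = features.float()   # cast from float64 (NumPy default) to the model's float32
outputs = net(features)       # shape (batch_size, 1), probabilities in (0, 1)
print(outputs)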


Let's practice!
INTERMEDIATE DEEP LEARNING WITH PYTORCH

Optimizers, training, and evaluation
INTERMEDIATE DEEP LEARNING WITH PYTORCH

Michal Oleszak
Machine Learning Engineer
Training loop
Define loss function and optimizer:
  BCELoss for binary classification
  SGD optimizer

Iterate over epochs and training batches:
  Clear gradients
  Forward pass: get model's outputs
  Compute loss
  Compute gradients
  Optimizer's step: update params

import torch.nn as nn
import torch.optim as optim

criterion = nn.BCELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

for epoch in range(1000):
    for features, labels in dataloader_train:
        optimizer.zero_grad()
        outputs = net(features)
        loss = criterion(
            outputs, labels.view(-1, 1)
        )
        loss.backward()
        optimizer.step()


How an optimizer works

[Figure slides: step-by-step illustration of how an optimizer works — images not included]


Stochastic Gradient Descent (SGD)
optimizer = optim.SGD(net.parameters(), lr=0.01)

Update depends on learning rate

Simple and efficient, for basic models

Rarely used in practice



Adaptive Gradient (Adagrad)
optimizer = optim.Adagrad(net.parameters(), lr=0.01)

Adapts learning rate for each parameter

Good for sparse data

May decrease the learning rate too fast



Root Mean Square Propagation (RMSprop)
optimizer = optim.RMSprop(net.parameters(), lr=0.01)

Update for each parameter based on the size of its previous gradients



Adaptive Moment Estimation (Adam)
optimizer = optim.Adam(net.parameters(), lr=0.01)

Arguably the most versatile and widely used

RMSprop + gradient momentum

Often used as the go-to optimizer

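A minimal sketch of dropping Adam into the training loop shown earlier; lr=0.001 is Adam's commonly used default, an assumption rather than a value from the course.

import torch.optim as optim

# Only the optimizer line of the earlier training loop changes;
# zero_grad, forward pass, loss, backward, and step stay the same.
optimizer = optim.Adam(net.parameters(), lr=0.001)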


Model evaluation
Set up the accuracy metric

Put the model in eval mode and iterate over test data batches with no gradients

Pass data to the model to get predicted probabilities

Compute predicted labels

Update the accuracy metric

from torchmetrics import Accuracy

acc = Accuracy(task="binary")

net.eval()
with torch.no_grad():
    for features, labels in dataloader_test:
        outputs = net(features)
        preds = (outputs >= 0.5).float()
        acc(preds, labels.view(-1, 1))

accuracy = acc.compute()
print(f"Accuracy: {accuracy}")

Accuracy: 0.6759443283081055


Let's practice!
INTERMEDIATE DEEP LEARNING WITH PYTORCH

Vanishing and exploding gradients
INTERMEDIATE DEEP LEARNING WITH PYTORCH

Michal Oleszak
Machine Learning Engineer
Vanishing gradients
Gradients get smaller and smaller during the backward pass

Earlier layers get small parameter updates

Model doesn't learn


Exploding gradients
Gradients get bigger and bigger

Parameter updates are too large

Training diverges
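A rough diagnostic sketch, assuming the Net class defined earlier in the chapter: run one forward/backward pass on random data and inspect per-layer gradient norms to see whether gradients are vanishing or exploding.

import torch
import torch.nn as nn

net = Net()
criterion = nn.BCELoss()
features = torch.rand(4, 9)                   # dummy batch: 4 samples, 9 features
labels = torch.randint(0, 2, (4, 1)).float()  # dummy binary labels
loss = criterion(net(features), labels)
loss.backward()
for name, param in net.named_parameters():
    print(f"{name}: gradient norm = {param.grad.norm().item():.2e}")
# Norms that shrink sharply toward the earlier layers suggest vanishing gradients;
# norms that keep growing across batches suggest exploding gradients.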


Solutions to unstable gradients
1. Proper weights initialization

2. Good activations

3. Batch normalization


Weights initialization
layer = nn.Linear(8, 1)
print(layer.weight)

Parameter containing:
tensor([[-0.0195, 0.0992, 0.0391, 0.0212,
-0.3386, -0.1892, -0.3170, 0.2148]])



Weights initialization
Good initialization ensures:

Variance of layer inputs = variance of layer outputs

Variance of gradients the same before and after a layer

How to achieve this depends on the activation:

For ReLU and similar, we can use He/Kaiming initialization



Weights initialization
import torch.nn.init as init

init.kaiming_uniform_(layer.weight)
print(layer.weight)

Parameter containing:
tensor([[-0.3063, -0.2410, 0.0588, 0.2664,
0.0502, -0.0136, 0.2274, 0.0901]])



He / Kaiming initialization
init.kaiming_uniform_(self.fc1.weight)
init.kaiming_uniform_(self.fc2.weight)
init.kaiming_uniform_(
    self.fc3.weight,
    nonlinearity="sigmoid",
)


He / Kaiming initialization
import torch.nn as nn
import torch.nn.init as init

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)

        init.kaiming_uniform_(self.fc1.weight)
        init.kaiming_uniform_(self.fc2.weight)
        init.kaiming_uniform_(
            self.fc3.weight,
            nonlinearity="sigmoid",
        )

    def forward(self, x):
        x = nn.functional.relu(self.fc1(x))
        x = nn.functional.relu(self.fc2(x))
        x = nn.functional.sigmoid(self.fc3(x))
        return x


Activation functions
nn.functional.relu()
  Often used as the default activation
  Zero for negative inputs - dying neurons

nn.functional.elu()
  Non-zero gradients for negative values - helps against dying neurons
  Average output around zero - helps against vanishing gradients
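A quick sketch of the difference on a few sample values; the inputs here are arbitrary, chosen only to show how each activation treats negatives.

import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.0])
print(F.relu(x))   # negatives become exactly 0, so their gradient is 0 (dying neurons)
print(F.elu(x))    # negatives map smoothly into (-1, 0), keeping gradients non-zero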


Batch normalization
After a layer:

1. Normalize the layer's outputs by:
  Subtracting the mean
  Dividing by the standard deviation

2. Scale and shift the normalized outputs using learned parameters

The model learns the optimal input distribution for each layer:
  Faster loss decrease
  Helps against unstable gradients


Batch normalization
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(9, 16)
        self.bn1 = nn.BatchNorm1d(16)

        ...

    def forward(self, x):
        x = self.fc1(x)
        x = self.bn1(x)
        x = nn.functional.elu(x)
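A small standalone sketch of what the normalization step does; the batch here is random data with an arbitrary mean and spread, used only to show that BatchNorm1d brings each feature back to roughly zero mean and unit variance in training mode.

import torch
import torch.nn as nn

bn = nn.BatchNorm1d(16)
bn.train()                       # training mode: normalize with batch statistics
x = torch.randn(32, 16) * 5 + 3  # a batch whose features have mean ~3 and std ~5
out = bn(x)
print(out.mean(dim=0))           # per-feature means close to 0
print(out.std(dim=0))            # per-feature stds close to 1 (learned scale/shift start at 1 and 0)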


Let's practice!
INTERMEDIATE DEEP LEARNING WITH PYTORCH
