Module in PyTorch

#python #pytorch #module #model

*Memos:

My post explains Linear Regression in PyTorch.
My post explains Batch, Mini-Batch and Stochastic Gradient Descent with DataLoader() in PyTorch.
My post explains Batch Gradient Descent without DataLoader() in PyTorch.
My post explains how to save a model in PyTorch.
My post explains how to load the saved model which I show in this post in PyTorch.
My post explains Deep Learning Workflow in PyTorch.
My post explains how to clone a private repository with FGPAT (Fine-Grained Personal Access Token) from GitHub.
My post explains how to clone a private repository with PAT (Personal Access Token) from GitHub.
My post explains useful IPython magic commands.
My repo has models.

Module() is the base class for models. You create a model by subclassing it, as shown below:

*Memos:

forward() must be overridden in the subclass of Module().
state_dict() returns an ordered dictionary containing the model's parameters and persistent buffers. *It doesn't include num3 and num4, which are defined without Parameter().
load_state_dict() copies parameters and buffers from a state_dict() into a model. *Basically, it's used to load a saved model into the currently used model.
parameters() returns an iterator over the module's parameters. *It doesn't include num3 and num4, which are defined without Parameter().
training is a boolean attribute which is True in train mode and False in eval mode. *By default, it's True (train mode).
train() sets a model to train mode.
eval() sets a model to eval mode.
cpu() moves all model parameters and buffers to the CPU.
cuda() moves all model parameters and buffers to a CUDA device (GPU).
There are also torch.save() and torch.load(), which are functions of torch, not methods of Module(). *My post explains save() and load().
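
Before the full example, here is a minimal sketch of the difference between a parameter, a buffer and a plain tensor attribute. The class name TinyModel and the attribute names weight, offset and plain are hypothetical and used only for illustration. register_buffer() is one way to make a non-trainable tensor appear in state_dict():

```python
import torch
from torch import nn

class TinyModel(nn.Module): # Hypothetical model for illustration
    def __init__(self):
        super().__init__()
        # Trainable: appears in `parameters()` and `state_dict()`
        self.weight = nn.Parameter(torch.tensor(9.))
        # Not trainable: appears in `buffers()` and `state_dict()`
        self.register_buffer("offset", torch.tensor(-2.))
        # Plain attribute: appears in neither of them
        self.plain = torch.tensor(6.)

    def forward(self, x):
        return self.weight * x + self.offset + self.plain

model = TinyModel()

state_keys = list(model.state_dict().keys())
param_names = [name for name, _ in model.named_parameters()]
buffer_names = [name for name, _ in model.named_buffers()]

print(state_keys)
# ['weight', 'offset']

print(param_names)
# ['weight']

print(buffer_names)
# ['offset']
```

A plain tensor attribute like plain is also not moved by cpu() or cuda(), so a tensor which must follow the model across devices is probably better registered as a buffer.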

import torch
from torch import nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.num1 = nn.Parameter(torch.tensor(9.))
        self.num2 = nn.Parameter(torch.tensor(7.))
        self.num3 = torch.tensor(-2.) # Defined without `Parameter()`
        self.num4 = torch.tensor(6.) # Defined without `Parameter()`
        self.layer1 = nn.Linear(in_features=4, out_features=5)
        self.layer2 = nn.Linear(in_features=5, out_features=2)
        self.layer3 = nn.Linear(in_features=2, out_features=3)
        self.relu = nn.ReLU()

    def forward(self, x): # Must be overridden
        x1 = self.layer1(input=x)
        x2 = self.relu(input=x1)
        x3 = self.layer2(input=x2)
        x4 = self.relu(input=x3)
        x5 = self.layer3(input=x4)
        return x5

my_tensor = torch.tensor([8., -3., 0., 1.])

torch.manual_seed(42)

mymodel = MyModel()
mymodel(x=my_tensor)
# tensor([0.8092, 0.8460, 0.3758], grad_fn=<ViewBackward0>)

mymodel
# MyModel(
#   (layer1): Linear(in_features=4, out_features=5, bias=True)
#   (layer2): Linear(in_features=5, out_features=2, bias=True)
#   (layer3): Linear(in_features=2, out_features=3, bias=True)
#   (relu): ReLU()
# )

mymodel.layer2
# Linear(in_features=5, out_features=2, bias=True)

mymodel.state_dict()
# OrderedDict([('num1', tensor(9.)),
#              ('num2', tensor(7.)),
#              ('layer1.weight',
#               tensor([[0.3823, 0.4150, -0.1171, 0.4593],
#                       [-0.1096, 0.1009, -0.2434, 0.2936],
#                       [0.4408, -0.3668, 0.4346, 0.0936],
#                       [0.3694, 0.0677, 0.2411, -0.0706],
#                       [0.3854, 0.0739, -0.2334, 0.1274]])),
#              ('layer1.bias',
#               tensor([-0.2304, -0.0586, -0.2031, 0.3317, -0.3947])),
#              ('layer2.weight',
#               tensor([[-0.2062, -0.1263, -0.2689, 0.0422, -0.4417],
#                       [0.4039, -0.3799, 0.3453, 0.0744, -0.1452]])),
#              ('layer2.bias', tensor([0.2764, 0.0697])),
#              ('layer3.weight',
#               tensor([[0.5713, 0.0773],
#                       [-0.2230, 0.1900],
#                       [-0.1918, 0.2976]])),
#              ('layer3.bias', tensor([0.6313, 0.4087, -0.3091]))])

mymodel.load_state_dict(state_dict=mymodel.state_dict())
# <All keys matched successfully>

params = mymodel.parameters()

next(params)
# Parameter containing:
# tensor(9., requires_grad=True)

next(params)
# Parameter containing:
# tensor(7., requires_grad=True)

next(params)
# Parameter containing:
# tensor([[0.3823, 0.4150, -0.1171, 0.4593],
#         [-0.1096, 0.1009, -0.2434, 0.2936],
#         [0.4408, -0.3668, 0.4346, 0.0936],
#         [0.3694, 0.0677, 0.2411, -0.0706],
#         [0.3854, 0.0739, -0.2334, 0.1274]], requires_grad=True)

next(params)
# Parameter containing:
# tensor([-0.2304, -0.0586, -0.2031, 0.3317, -0.3947],
#        requires_grad=True)

next(params)
# Parameter containing:
# tensor([[-0.2062, -0.1263, -0.2689, 0.0422, -0.4417],
#         [0.4039, -0.3799, 0.3453, 0.0744, -0.1452]],
#        requires_grad=True)

next(params)
# Parameter containing:
# tensor([0.2764, 0.0697], requires_grad=True)

next(params)
# Parameter containing:
# tensor([[0.5713, 0.0773],
#         [-0.2230, 0.1900],
#         [-0.1918, 0.2976]], requires_grad=True)

next(params)
# Parameter containing:
# tensor([0.6313, 0.4087, -0.3091], requires_grad=True)

mymodel.training
# True

mymodel.eval()

mymodel.training
# False

mymodel.train()

mymodel.training
# True

mymodel.cuda(device='cuda:0') # Requires an available CUDA device

mymodel.layer2.weight.device
# device(type='cuda', index=0)

mymodel.cpu()

mymodel.layer2.weight.device
# device(type='cpu')
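
The model above has no layer which behaves differently between train mode and eval mode, so switching modes only changes training there. The sketch below uses Dropout() to make the difference visible. The variable names model, x, y_train and y_eval are assumptions for illustration, and it is a minimal sketch rather than a training recipe:

```python
import torch
from torch import nn

# A tiny model whose only layer depends on the mode
model = nn.Sequential(nn.Dropout(p=0.5))
x = torch.ones(1000)

model.train() # Train mode: elements are randomly zeroed
y_train = model(x) # Kept elements are scaled by 1/(1-p) = 2

model.eval() # Eval mode: Dropout() does nothing
y_eval = model(x)

print(model.training)
# False

print(model[0].training) # The mode is propagated to submodules
# False

print(torch.equal(y_eval, x)) # The input passes through unchanged
# True

print(set(y_train.tolist()) <= {0.0, 2.0}) # Only zeroed or scaled values
# True
```

Calling eval() before inference matters for models which contain layers like Dropout() or BatchNorm2d(), because their output depends on the mode.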