PyTorch Tutorial

13. RNN Classifier

Lecturer: Hongpu Liu, SLAM Research Group
RNN Classifier – Name Classification

We shall train on a few thousand surnames from 18 languages of origin, and predict which language a name comes from based on its spelling.

Name        Country
Maclean     English
Vajnichy    Russian
Nasikovsky  Russian
Usami       Japanese
Fionin      Russian
Sharkey     English
Balagul     Russian
Pakhrin     Russian
Tansho      Japanese
Revision

[Figure: the sequence model from the previous lecture. Each input x1 ... xN is embedded, passed through a chain of RNN cells starting from h0, and a linear layer on every hidden state produces an output at every time step, ending in the final hidden state.]
Our Model

[Figure: the same embed-then-RNN chain, but now only the final hidden state hN goes through a single linear layer. For classification we only need one output for the whole sequence.]
Our Model

x -> Embedding Layer -> GRU Layer (h0 ... hN) -> Linear Layer -> o

The input name x is embedded, a GRU consumes the embedded sequence starting from hidden state h0, and its final hidden state hN is passed through a linear layer to produce the output o, the predicted country (see the name/country table above).
Implementation – Main Cycle

if __name__ == '__main__':
    # Instantiate the classifier model.
    classifier = RNNClassifier(N_CHARS, HIDDEN_SIZE, N_COUNTRY, N_LAYER)
    # Whether to use the GPU for training the model.
    if USE_GPU:
        device = torch.device("cuda:0")
        classifier.to(device)

    # Use cross-entropy loss as the loss function and the Adam optimizer.
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(classifier.parameters(), lr=0.001)

    start = time.time()
    print("Training for %d epochs..." % N_EPOCHS)
    acc_list = []
    # In every epoch, train and test the model once, recording the test accuracy.
    for epoch in range(1, N_EPOCHS + 1):
        trainModel()
        acc = testModel()
        acc_list.append(acc)

For printing the elapsed time:

import time
import math

def time_since(since):
    s = time.time() - since
    m = math.floor(s / 60)
    s -= m * 60
    return '%dm %ds' % (m, s)

After training, plot the recorded test accuracy:

import matplotlib.pyplot as plt
import numpy as np

epoch = np.arange(1, len(acc_list) + 1, 1)
acc_list = np.array(acc_list)
plt.plot(epoch, acc_list)
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.grid()
plt.show()
Implementation – Preparing Data

Name Characters ASCII

Maclean ['M', 'a', 'c', 'l', 'e', 'a', 'n'] [ 77 97 99 108 101 97 110]

Vajnichy ['V', 'a', 'j', 'n', 'i', 'c', 'h', 'y'] [ 86 97 106 110 105 99 104 121]

Nasikovsky ['N', 'a', 's', 'i', 'k', 'o', 'v', 's', 'k', 'y'] [ 78 97 115 105 107 111 118 115 107 121]

Usami ['U', 's', 'a', 'm', 'i'] [ 85 115 97 109 105]

Fionin ['F', 'i', 'o', 'n', 'i', 'n'] [ 70 105 111 110 105 110]

Sharkey ['S', 'h', 'a', 'r', 'k', 'e', 'y'] [ 83 104 97 114 107 101 121]

Balagul ['B', 'a', 'l', 'a', 'g', 'u', 'l'] [ 66 97 108 97 103 117 108]

Pakhrin ['P', 'a', 'k', 'h', 'r', 'i', 'n'] [ 80 97 107 104 114 105 110]

Tansho ['T', 'a', 'n', 's', 'h', 'o'] [ 84 97 110 115 104 111]

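As a quick sanity check, the character-to-ASCII mapping in this table can be reproduced with Python's built-in ord (a toy snippet, not part of the lecture code):

name = 'Maclean'
print([ord(c) for c in name])  # [77, 97, 99, 108, 101, 97, 110]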
Implementation – Preparing Data

ASCII After padding

[ 77 97 99 108 101 97 110] [ 77 97 99 108 101 97 110 0 0 0]

[ 86 97 106 110 105 99 104 121] [ 86 97 106 110 105 99 104 121 0 0]

[ 78 97 115 105 107 111 118 115 107 121] [ 78 97 115 105 107 111 118 115 107 121]

[ 85 115 97 109 105] [ 85 115 97 109 105 0 0 0 0 0]

[ 70 105 111 110 105 110] [ 70 105 111 110 105 110 0 0 0 0]

[ 83 104 97 114 107 101 121] [ 83 104 97 114 107 101 121 0 0 0]

[ 66 97 108 97 103 117 108] [ 66 97 108 97 103 117 108 0 0 0]

[ 80 97 107 104 114 105 110] [ 80 97 107 104 114 105 110 0 0 0]

[ 84 97 110 115 104 111] [ 84 97 110 115 104 111 0 0 0 0]

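A minimal sketch of the zero-padding step with a toy two-name batch (the full version appears later in make_tensors):

import torch

sequences = [[77, 97, 99, 108, 101, 97, 110],  # 'Maclean'
             [85, 115, 97, 109, 105]]          # 'Usami'
lengths = [len(s) for s in sequences]
# Zero tensor of shape (batchSize, maxSeqLen); copy each sequence into its row.
padded = torch.zeros(len(sequences), max(lengths)).long()
for i, seq in enumerate(sequences):
    padded[i, :lengths[i]] = torch.LongTensor(seq)
print(padded)
# tensor([[ 77,  97,  99, 108, 101,  97, 110],
#         [ 85, 115,  97, 109, 105,   0,   0]])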
Implementation – Preparing Data

Country Index Country Index


Arabic 0 Chinese 1
Czech 2 Dutch 3
English 4 French 5
German 6 Greek 7
Irish 8 Italian 9
Japanese 10 Korean 11
Polish 12 Portuguese 13
Russian 14 Scottish 15
Spanish 16 Vietnamese 17

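This table comes from sorting the distinct country names and enumerating them. A toy sketch with only three countries (so the indices differ from the full 18-country table):

countries = ['Russian', 'English', 'Japanese', 'Russian']
country_list = sorted(set(countries))   # ['English', 'Japanese', 'Russian']
country_dict = {name: idx for idx, name in enumerate(country_list)}
print(country_dict)                     # {'English': 0, 'Japanese': 1, 'Russian': 2}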
Implementation – Preparing Data

import gzip
import csv
from torch.utils.data import Dataset

class NameDataset(Dataset):
    def __init__(self, is_train_set=True):
        # Read the data from the .gz file with the gzip and csv packages.
        filename = 'data/names_train.csv.gz' if is_train_set else 'data/names_test.csv.gz'
        with gzip.open(filename, 'rt') as f:
            reader = csv.reader(f)
            rows = list(reader)
        # Save the names and countries in lists.
        self.names = [row[0] for row in rows]
        self.len = len(self.names)
        self.countries = [row[1] for row in rows]
        # Save the distinct countries in a sorted list and build the
        # country-to-index dictionary (see the table above).
        self.country_list = list(sorted(set(self.countries)))
        self.country_dict = self.getCountryDict()
        self.country_num = len(self.country_list)

    def __getitem__(self, index):
        # Return the name string and the index of its country.
        return self.names[index], self.country_dict[self.countries[index]]

    def __len__(self):
        # Return the length of the dataset.
        return self.len
Implementation – Preparing Data

class NameDataset(Dataset):
    ...

    def getCountryDict(self):
        # Convert the country list into a name-to-index dictionary.
        country_dict = dict()
        for idx, country_name in enumerate(self.country_list, 0):
            country_dict[country_name] = idx
        return country_dict

    def idx2country(self, index):
        # Return the country name for a given index.
        return self.country_list[index]

    def getCountriesNum(self):
        # Return the number of countries.
        return self.country_num
Implementation – Preparing Data

# Parameters
HIDDEN_SIZE = 100
BATCH_SIZE = 256
N_LAYER = 2
N_EPOCHS = 100
N_CHARS = 128
USE_GPU = False

# Prepare the Dataset and DataLoader.
trainset = NameDataset(is_train_set=True)
trainloader = DataLoader(trainset, batch_size=BATCH_SIZE, shuffle=True)
testset = NameDataset(is_train_set=False)
testloader = DataLoader(testset, batch_size=BATCH_SIZE, shuffle=False)

# N_COUNTRY is the output size of our model.
N_COUNTRY = trainset.getCountriesNum()
Implementation – Model Design

class RNNClassifier(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1, bidirectional=True):
        super(RNNClassifier, self).__init__()
        # Parameters of the GRU layer.
        self.hidden_size = hidden_size
        self.n_layers = n_layers
        self.n_directions = 2 if bidirectional else 1

        # Input of the embedding layer has shape (seqLen, batchSize);
        # its output has shape (seqLen, batchSize, hiddenSize).
        self.embedding = torch.nn.Embedding(input_size, hidden_size)
        # Inputs of the GRU layer:
        #   input:  (seqLen, batchSize, hiddenSize)
        #   hidden: (nLayers * nDirections, batchSize, hiddenSize)
        # Outputs of the GRU layer:
        #   output: (seqLen, batchSize, hiddenSize * nDirections)
        #   hidden: (nLayers * nDirections, batchSize, hiddenSize)
        self.gru = torch.nn.GRU(hidden_size, hidden_size, n_layers,
                                bidirectional=bidirectional)
        self.fc = torch.nn.Linear(hidden_size * self.n_directions, output_size)

    def _init_hidden(self, batch_size):
        hidden = torch.zeros(self.n_layers * self.n_directions,
                             batch_size, self.hidden_size)
        return create_tensor(hidden)

So what is a bi-directional RNN/LSTM/GRU?
Implementation – Bi-direction RNN/LSTM/GRU

A bi-directional RNN/LSTM/GRU runs two chains of cells over the sequence. The forward chain starts from h_0^f and processes x_1, x_2, ..., x_N from left to right, producing hidden states h_1^f, ..., h_N^f. The backward chain starts from h_0^b and processes the same inputs from right to left, producing h_N^b, ..., h_1^b. At each time step the forward and backward states are concatenated to form that step's output, and the hidden state returned by the layer stacks the final states of both directions:

    hidden = [h_N^f, h_N^b]

[Figure: forward and backward RNN cell chains over x_1 ... x_N, with a Concat box at each time step joining the two hidden states into the output sequence.]
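A small runnable sketch (toy sizes chosen here) showing how PyTorch lays out the bidirectional GRU's hidden state; hidden[-2] and hidden[-1] are the last layer's forward and backward final states, which is why the classifier below concatenates exactly those two:

import torch

seq_len, batch_size, hidden_size, n_layers = 5, 3, 4, 2
gru = torch.nn.GRU(hidden_size, hidden_size, n_layers, bidirectional=True)
x = torch.randn(seq_len, batch_size, hidden_size)
h0 = torch.zeros(n_layers * 2, batch_size, hidden_size)
output, hidden = gru(x, h0)
print(output.shape)  # torch.Size([5, 3, 8])  (seqLen, batchSize, hiddenSize * nDirections)
print(hidden.shape)  # torch.Size([4, 3, 4])  (nLayers * nDirections, batchSize, hiddenSize)
# Concatenate the last layer's backward and forward final states:
hidden_cat = torch.cat([hidden[-1], hidden[-2]], dim=1)
print(hidden_cat.shape)  # torch.Size([3, 8])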
Implementation – Model Design

class RNNClassifier(torch.nn.Module):
    def forward(self, input, seq_lengths):
        # Input shape: B x S -> S x B
        input = input.t()
        # Save the batch size for making the initial hidden state.
        batch_size = input.size(1)

        # Initial hidden state with shape (nLayers * nDirections, batchSize, hiddenSize).
        hidden = self._init_hidden(batch_size)
        # Result of the embedding with shape (seqLen, batchSize, hiddenSize).
        embedding = self.embedding(input)

        # Pack them up: the first argument has shape (seqLen, batchSize, hiddenSize);
        # the second is a tensor listing the sequence length of each batch element.
        # pack_padded_sequence returns a PackedSequence object.
        gru_input = pack_padded_sequence(embedding, seq_lengths)

        # output is again a PackedSequence (actually a tuple); hidden, which we
        # care about here, has shape (nLayers * nDirections, batchSize, hiddenSize).
        output, hidden = self.gru(gru_input, hidden)

        # If we use a bidirectional GRU, the forward and backward hidden states
        # must be concatenated.
        if self.n_directions == 2:
            hidden_cat = torch.cat([hidden[-1], hidden[-2]], dim=1)
        else:
            hidden_cat = hidden[-1]
        # Use a linear classifier on the final hidden state.
        fc_output = self.fc(hidden_cat)
        return fc_output

The padded batch arrives with shape (batchSize, seqLen); the transpose turns it into (seqLen, batchSize), so each row of the matrix holds one time step across the whole batch. The embedding layer then maps this (seqLen, batchSize) index matrix to a (seqLen, batchSize, hiddenSize) tensor.

[Figure: the padded ASCII matrix before and after the transpose, and the embedded (seqLen, batchSize, hiddenSize) tensor.]

pack_padded_sequence cannot work on an arbitrary batch: the sequence lengths must be sorted in descending order. We therefore have to sort the batch elements by sequence length, longest first, before packing.

[Figure: the batch columns reordered by descending sequence length, e.g. lengths 10 8 7 7 7 7 6 6 5.]

The resulting PackedSequence stores only the non-padding embedding rows in a flat data tensor, together with batch_sizes, the number of still-active sequences at each time step; this lets the GRU skip the padded positions entirely.
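A toy example (values invented here for illustration) of what pack_padded_sequence produces for a sorted, padded batch:

import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Three sequences of lengths 3, 2, 1 (already sorted descending),
# zero-padded and laid out as (seqLen, batchSize, featureSize).
padded = torch.tensor([[[1.], [4.], [6.]],
                       [[2.], [5.], [0.]],
                       [[3.], [0.], [0.]]])
lengths = torch.tensor([3, 2, 1])
packed = pack_padded_sequence(padded, lengths)
print(packed.data.squeeze())  # tensor([1., 4., 6., 2., 5., 3.]) (padding rows are gone)
print(packed.batch_sizes)     # tensor([3, 2, 1]) (active sequences at each time step)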
Implementation – Convert name to tensor

Each name is converted to a list of ASCII codes (see the tables in the data-preparation section), the lists are zero-padded to the longest name in the batch, and the batch is sorted by descending sequence length so it can be packed later.

def name2list(name):
    # Convert a name string to a list of ASCII codes plus its length.
    arr = [ord(c) for c in name]
    return arr, len(arr)

def create_tensor(tensor):
    # Move the tensor to the GPU if requested.
    if USE_GPU:
        device = torch.device("cuda:0")
        tensor = tensor.to(device)
    return tensor

def make_tensors(names, countries):
    sequences_and_lengths = [name2list(name) for name in names]
    name_sequences = [sl[0] for sl in sequences_and_lengths]
    seq_lengths = torch.LongTensor([sl[1] for sl in sequences_and_lengths])
    countries = countries.long()

    # Make a zero-padded tensor of names, BatchSize x SeqLen.
    seq_tensor = torch.zeros(len(name_sequences), seq_lengths.max()).long()
    for idx, (seq, seq_len) in enumerate(zip(name_sequences, seq_lengths), 0):
        seq_tensor[idx, :seq_len] = torch.LongTensor(seq)

    # Sort by descending length to use pack_padded_sequence.
    seq_lengths, perm_idx = seq_lengths.sort(dim=0, descending=True)
    seq_tensor = seq_tensor[perm_idx]
    countries = countries[perm_idx]

    return create_tensor(seq_tensor), \
           create_tensor(seq_lengths), \
           create_tensor(countries)
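A quick usage sketch of make_tensors with a toy two-name batch (label indices taken from the country table above; USE_GPU is assumed False):

import torch

names = ['Usami', 'Maclean']        # lengths 5 and 7
countries = torch.tensor([10, 4])   # Japanese, English
seq_tensor, seq_lengths, countries = make_tensors(names, countries)
print(seq_lengths)  # tensor([7, 5]) -> sorted descending, so 'Maclean' moves first
print(seq_tensor)
# tensor([[ 77,  97,  99, 108, 101,  97, 110],
#         [ 85, 115,  97, 109, 105,   0,   0]])
print(countries)    # tensor([ 4, 10]) -> labels reordered together with the names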
Implementation – One Epoch Training

def trainModel():
    total_loss = 0
    for i, (names, countries) in enumerate(trainloader, 1):
        inputs, seq_lengths, target = make_tensors(names, countries)
        # 1. forward - compute the output of the model
        output = classifier(inputs, seq_lengths)
        # 2. forward - compute the loss
        loss = criterion(output, target)
        # 3. zero grad
        optimizer.zero_grad()
        # 4. backward
        loss.backward()
        # 5. update
        optimizer.step()

        total_loss += loss.item()
        if i % 10 == 0:
            print(f'[{time_since(start)}] Epoch {epoch} ', end='')
            print(f'[{i * len(inputs)}/{len(trainset)}] ', end='')
            print(f'loss={total_loss / (i * len(inputs))}')
    return total_loss
Implementation – Testing

def testModel():
    correct = 0
    total = len(testset)
    print("evaluating trained model ...")
    # Tell PyTorch not to build the computational graph for gradients,
    # saving time and memory.
    with torch.no_grad():
        for i, (names, countries) in enumerate(testloader, 1):
            inputs, seq_lengths, target = make_tensors(names, countries)
            # Compute the output of the model.
            output = classifier(inputs, seq_lengths)
            pred = output.max(dim=1, keepdim=True)[1]
            # Count the number of correctly predicted samples.
            correct += pred.eq(target.view_as(pred)).sum().item()

        percent = '%.2f' % (100 * correct / total)
        print(f'Test set: Accuracy {correct}/{total} {percent}%')

    return correct / total
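A toy illustration (made-up scores) of how the predicted class and the correct count are computed:

import torch

output = torch.tensor([[0.1, 2.3, 0.5],
                       [1.7, 0.2, 0.9]])   # (batchSize, numClasses) scores
pred = output.max(dim=1, keepdim=True)[1]  # index of the max score per row
print(pred)                                # tensor([[1], [0]])
target = torch.tensor([1, 2])
correct = pred.eq(target.view_as(pred)).sum().item()
print(correct)                             # 1 -> only the first prediction matches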
Implementation – Result

[Figure: training log and the test-accuracy curve produced by the plotting code above.]
Exercise 13-1 Sentiment Analysis on Movie Reviews

• The Rotten Tomatoes movie review dataset is a corpus of movie reviews used for sentiment analysis.
• Dataset: https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data
• The dataset comprises tab-separated files with phrases from the Rotten Tomatoes dataset.
• The sentiment labels are:
  • 0 - negative
  • 1 - somewhat negative
  • 2 - neutral
  • 3 - somewhat positive
  • 4 - positive
Exercise 13-1 Sentiment Analysis on Movie Reviews

A sample of the training data:

PhraseId | SentenceId | Phrase | Sentiment
1  | 1 | A series of escapades demonstrating the adage that what is good for the goose is also good for the gander , some of which occasionally amuses but none of which amounts to much of a story . | 1
2  | 1 | A series of escapades demonstrating the adage that what is good for the goose | 2
3  | 1 | A series | 2
4  | 1 | A | 2
5  | 1 | series | 2
6  | 1 | of escapades demonstrating the adage that what is good for the goose | 2
7  | 1 | of | 2
8  | 1 | escapades demonstrating the adage that what is good for the goose | 2
9  | 1 | escapades | 2
10 | 1 | demonstrating the adage that what is good for the goose | 2
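A minimal starting-point sketch for the exercise, assuming the Kaggle file has been downloaded and unzipped into data/ (the file name and column order here are assumptions based on the sample above); tokenizing the phrases and building a vocabulary are left as part of the exercise:

import csv
from torch.utils.data import Dataset

class SentimentDataset(Dataset):
    def __init__(self, filename='data/train.tsv'):  # assumed path
        with open(filename, 'rt', encoding='utf-8') as f:
            reader = csv.reader(f, delimiter='\t')
            rows = list(reader)[1:]                  # skip the header row
        self.phrases = [row[2] for row in rows]      # Phrase column
        self.labels = [int(row[3]) for row in rows]  # Sentiment column

    def __getitem__(self, index):
        return self.phrases[index], self.labels[index]

    def __len__(self):
        return len(self.phrases)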