2.1 Objectives
Out[1]: True
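The setup cell itself is not reproduced above; Out[1] is presumably the result of checking that a CUDA-capable GPU is available. A minimal sketch of what that cell likely contained (the exact imports and the device variable name are assumptions, but later cells rely on a device handle like this):

import torch
import torch.nn as nn

# Use the GPU if one is available; later cells assume this `device` handle
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
torch.cuda.is_available()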
2.2.1 Kaggle
This dataset is available from the website Kaggle, which is a fantastic place to find datasets and other deep learning resources. In
addition to providing resources like datasets and "kernels" that are like these notebooks, Kaggle hosts competitions that you can
take part in, competing with others in training highly accurate models.
If you're looking to practice or see examples of many deep learning projects, Kaggle is a great site to visit.
To load and work with the data, we'll be using a library called Pandas, which is a highly performant tool for loading and
manipulating data. We'll read the CSV files into a format called a DataFrame.
Pandas has a read_csv method that expects a path to a CSV file and returns a DataFrame:
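The loading cell is not shown above; it likely looked something like the following (the file paths are assumptions based on the usual layout of the Sign Language MNIST dataset, so adjust them to wherever your CSVs live):

import pandas as pd

# Read the training and validation splits into DataFrames
train_df = pd.read_csv("data/asl_data/sign_mnist_train.csv")
valid_df = pd.read_csv("data/asl_data/sign_mnist_valid.csv")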
In [3]: train_df.head()
Out[3]:    label  pixel1  pixel2  pixel3  pixel4  pixel5  pixel6  pixel7  pixel8  pixel9  ...  pixel775  pixel776  pixel777  pixel778
        0      3     107     118     127     134     139     143     146     150     153  ...       207       207       207       207
        1      6     155     157     156     156     156     157     156     158     158  ...        69       149       128        87
        2      2     187     188     188     187     187     186     187     188     187  ...       202       201       200       199
        3      2     211     211     212     212     211     210     211     210     210  ...       235       234       233       231
        4     12     164     167     170     172     176     179     180     184     185  ...        92       105       105       108
Out[4]: 0 3
1 6
2 2
3 2
4 12
..
27450 12
27451 22
27452 17
27453 16
27454 22
Name: label, Length: 27455, dtype: int64
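The cells that split the DataFrames into features and labels are not fully shown; Out[4] above is the label column of the training DataFrame, so the split was presumably along these lines (valid_df is the validation DataFrame loaded earlier, and keeping x as a plain array is an assumption):

# Pop the label column off each DataFrame; the remaining columns are pixel values
y_train = train_df.pop('label')   # pandas Series of class indices
x_train = train_df.values         # NumPy array of pixel values

y_valid = valid_df.pop('label')
x_valid = valid_df.values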
In [6]: x_train.shape
Out[6]: (27455, 784)
In [7]: y_train.shape
Out[7]: (27455,)
In [8]: x_valid.shape
Out[8]: (7172, 784)
In [9]: y_valid.shape
Out[9]: (7172,)
Note that we'll have to reshape the data from its current 1D shape of 784 pixels, to a 2D shape of 28x28 pixels to make sense of the
image:
import matplotlib.pyplot as plt

# Plot the first 20 training images with their labels as titles
num_images = 20
for i in range(num_images):
    row = x_train[i]
    label = y_train[i]
    # Reshape the flat row of 784 pixels into a 28x28 image
    image = row.reshape(28, 28)
    plt.subplot(1, num_images, i + 1)
    plt.title(label, fontdict={'fontsize': 30})
    plt.axis('off')
    plt.imshow(image, cmap='gray')
In [11]: x_train.min()
Out[11]: 0
In [12]: x_train.max()
Out[12]: 255
In the previous lab, we used ToTensor, but we can also modify our data before turning it into a tensor. Since our dataset is small enough, we can store all of it on our GPU for faster processing. In the previous lab, we sent each batch to the GPU as it was drawn from the DataLoader. Here, we will send everything to the GPU in the dataset's __init__ function.
    def __len__(self):
        return len(self.xs)
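Only the __len__ method survives in the excerpt above. A minimal sketch of the full dataset class, assuming it is called MyDataset, takes the arrays split off earlier, and scales pixel values from [0, 255] to [0, 1] in __init__:

from torch.utils.data import Dataset

class MyDataset(Dataset):
    def __init__(self, x_df, y_df):
        # Convert to tensors once, normalize the pixels,
        # and move everything to the GPU up front
        self.xs = torch.tensor(x_df).float().to(device) / 255
        self.ys = torch.tensor(y_df).to(device)

    def __getitem__(self, idx):
        return self.xs[idx], self.ys[idx]

    def __len__(self):
        return len(self.xs)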
A custom PyTorch dataset works just like a prebuilt one. It should be passed to a DataLoader for model training.
In [15]: BATCH_SIZE = 32
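The cell that constructs the DataLoaders is not shown; a typical version, assuming the dataset class above is named MyDataset and that train_N/valid_N (used later for accuracy) are the sample counts, would be:

from torch.utils.data import DataLoader

train_data = MyDataset(x_train, y_train.values)
valid_data = MyDataset(x_valid, y_valid.values)

# Shuffle the training data each epoch; validation order does not matter
train_loader = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True)
valid_loader = DataLoader(valid_data, batch_size=BATCH_SIZE)

# Total sample counts, used later to average the per-batch accuracy
train_N = len(train_loader.dataset)
valid_N = len(valid_loader.dataset)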
We can verify the DataLoader works as expected with the code below. We'll make the DataLoader iterable, and then call next to
draw the first hand from the deck.
In [17]: train_loader
Try running the below a few times. The values should change each time.
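The cell that draws a batch is not reproduced here; it was presumably something like:

# Draw one batch from the shuffled training DataLoader
batch = next(iter(train_loader))
batch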
Notice the batch has two values. The first is our x , and the second is our y . The first dimension of each should have 32 values,
which is the batch_size .
In [19]: batch[0].shape
In [20]: batch[1].shape
Out[20]: torch.Size([32])
Exercise
For this exercise we are going to build a sequential model. Just like last time, build a model that:
- flattens the input image
- has a dense input layer with 512 neurons and a ReLU activation
- has a dense hidden layer with 512 neurons and a ReLU activation
- has a dense output layer with one neuron per class
In [21]: input_size = 28 * 28
n_classes = 24
Do your work in the cell below, creating a model variable to store the model. We've imported the Sequential model class and Linear layer class to get you started. Reveal the solution below for a hint:
Solution
In [23]: # SOLUTION
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(input_size, 512),  # Input
    nn.ReLU(),                   # Activation for input
    nn.Linear(512, 512),         # Hidden
    nn.ReLU(),                   # Activation for hidden
    nn.Linear(512, n_classes)    # Output
)
This time, we'll combine compiling the model and sending it to the GPU in one step:
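The compile cell itself is not shown, but the OptimizedModule output below indicates torch.compile was used. A likely one-liner, assuming the device handle defined earlier:

# Move the model to the GPU and compile it in one step
model = torch.compile(model.to(device))
model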
Out[24]: OptimizedModule(
(_orig_mod): Sequential(
(0): Flatten(start_dim=1, end_dim=-1)
(1): Linear(in_features=784, out_features=512, bias=True)
(2): ReLU()
(3): Linear(in_features=512, out_features=512, bias=True)
(4): ReLU()
(5): Linear(in_features=512, out_features=24, bias=True)
)
)
Since categorizing these ASL images is similar to categorizing MNIST's handwritten digits, we will use the same loss_function (categorical cross-entropy) and optimizer (Adam). nn.CrossEntropyLoss applies the softmax function internally, and it is computationally faster to pass it class indices as targets rather than predicted probabilities.
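The corresponding cell is not reproduced; it presumably looked like this (the variable names loss_function and optimizer match how they are used in the training loop below):

from torch.optim import Adam

loss_function = nn.CrossEntropyLoss()
optimizer = Adam(model.parameters())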
Before looping through the DataLoader, we will put the model in training mode with model.train() so that its parameters can be updated. To make it easier to follow training progress, we'll keep a running total of the loss and accuracy.
def train():
    loss = 0
    accuracy = 0

    model.train()
    for x, y in train_loader:
        output = model(x)
        optimizer.zero_grad()
        batch_loss = loss_function(output, y)
        batch_loss.backward()
        optimizer.step()

        # Accumulate totals so we can report them at the end of the epoch
        loss += batch_loss.item()
        accuracy += get_batch_accuracy(output, y, train_N)
    print('Train - Loss: {:.4f} Accuracy: {:.4f}'.format(loss, accuracy))
One key difference is that we will put the model in evaluation mode with model.eval(), and wrap the loop in torch.no_grad() so that no gradients are computed and no parameters are updated.
def validate():
    loss = 0
    accuracy = 0

    model.eval()
    with torch.no_grad():
        for x, y in valid_loader:
            output = model(x)

            loss += loss_function(output, y).item()
            accuracy += get_batch_accuracy(output, y, valid_N)
    print('Valid - Loss: {:.4f} Accuracy: {:.4f}'.format(loss, accuracy))
Exercise
The function below has three FIXMEs. Each one corresponds to one of the function's input arguments. Can you replace each FIXME with the correct argument?
It may help to view the documentation for argmax, eq, and view_as.
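The FIXME cell itself is not reproduced above; it presumably looked something like the following, with each FIXME standing in for one of the arguments output, y, and N:

def get_batch_accuracy(output, y, N):
    pred = FIXME.argmax(dim=1, keepdim=True)
    correct = pred.eq(FIXME.view_as(pred)).sum().item()
    return correct / FIXME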
Solution
In [29]: # SOLUTION
def get_batch_accuracy(output, y, N):
    # Index of the largest output value is the predicted class
    pred = output.argmax(dim=1, keepdim=True)
    # Count how many predictions match the labels, then average over N samples
    correct = pred.eq(y.view_as(pred)).sum().item()
    return correct / N
In [30]: epochs = 20
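The rest of the training cell is not shown; assuming the training and validation passes above are wrapped in train() and validate() functions as sketched, it likely continued along these lines:

for epoch in range(epochs):
    print('Epoch: {}'.format(epoch))
    train()
    validate()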
Epoch: 0
Train - Loss: 1585.5740 Accuracy: 0.3956
Valid - Loss: 323.2296 Accuracy: 0.5409
Epoch: 1
Train - Loss: 729.8412 Accuracy: 0.7089
Valid - Loss: 273.1569 Accuracy: 0.6051
Epoch: 2
Train - Loss: 392.3283 Accuracy: 0.8469
Valid - Loss: 229.9985 Accuracy: 0.6694
Epoch: 3
Train - Loss: 213.5582 Accuracy: 0.9214
Valid - Loss: 223.7834 Accuracy: 0.7375
Epoch: 4
Train - Loss: 133.0160 Accuracy: 0.9525
Valid - Loss: 206.5498 Accuracy: 0.7670
Epoch: 5
Train - Loss: 70.6900 Accuracy: 0.9776
Valid - Loss: 272.6428 Accuracy: 0.7375
Epoch: 6
Train - Loss: 90.4645 Accuracy: 0.9663
Valid - Loss: 267.6985 Accuracy: 0.7270
Epoch: 7
Train - Loss: 40.0967 Accuracy: 0.9885
Valid - Loss: 226.3849 Accuracy: 0.7741
Epoch: 8
Train - Loss: 53.9879 Accuracy: 0.9817
Valid - Loss: 219.3163 Accuracy: 0.7987
Epoch: 9
Train - Loss: 50.4869 Accuracy: 0.9823
Valid - Loss: 220.8031 Accuracy: 0.8133
Epoch: 10
Train - Loss: 2.9627 Accuracy: 0.9999
Valid - Loss: 235.2547 Accuracy: 0.8108
Epoch: 11
Train - Loss: 79.4522 Accuracy: 0.9765
Valid - Loss: 291.3555 Accuracy: 0.7305
Epoch: 12
Train - Loss: 17.9432 Accuracy: 0.9954
Valid - Loss: 247.6815 Accuracy: 0.7829
Epoch: 13
Train - Loss: 57.6867 Accuracy: 0.9796
Valid - Loss: 229.8657 Accuracy: 0.8002
Epoch: 14
Train - Loss: 2.0873 Accuracy: 1.0000
Valid - Loss: 237.5475 Accuracy: 0.8115
Epoch: 15
Train - Loss: 67.7856 Accuracy: 0.9788
Valid - Loss: 220.0935 Accuracy: 0.8069
Epoch: 16
Train - Loss: 3.8451 Accuracy: 0.9997
Valid - Loss: 245.0788 Accuracy: 0.8021
Epoch: 17
Train - Loss: 55.6508 Accuracy: 0.9807
Valid - Loss: 222.9318 Accuracy: 0.8150
Epoch: 18
Train - Loss: 1.8120 Accuracy: 1.0000
Valid - Loss: 245.3994 Accuracy: 0.8134
Epoch: 19
Train - Loss: 1.0882 Accuracy: 1.0000
Valid - Loss: 250.8158 Accuracy: 0.8116
The training accuracy climbed to nearly 100%, but the validation accuracy stalled at around 81%. What do you think is going on? Think about it for a bit before clicking on the '...' below to reveal the answer.
# SOLUTION This is an example of the model learning to categorize the training data, but performing poorly against new data that
it has not been trained on. Essentially, it is memorizing the dataset, but not gaining a robust and general understanding of the
problem. This is a common issue called overfitting. We will discuss overfitting in the next two lectures, as well as some ways to
address it.
2.7 Summary
In this section you built your own neural network to perform image classification and trained it to quite high accuracy. Congrats!
At this point we should be getting somewhat familiar with the process of loading data (including labels), preparing it, creating a
model, and then training the model with prepared data.
2.7.2 Next
Now that you have built some very basic, somewhat effective models, we will begin to learn about more sophisticated models,
including Convolutional Neural Networks.