Implementation of a CNN based Image Classifier using PyTorch
Last Updated: 31 Jul, 2025
Deep learning has transformed computer vision, making it possible to classify and interpret images with high accuracy. This article walks through a practical, step-by-step implementation of a convolutional neural network (CNN) for image classification using PyTorch on the CIFAR-10 dataset, a collection of 60,000 32x32 color images in 10 classes.
Step 1: Importing Libraries and Setting Up
To build our model, we first import PyTorch libraries and prepare the environment for visualization and data handling.
- torch (PyTorch): Enables building, training and running deep learning models using tensors.
- torchvision: Supplies standard vision datasets, image transforms and visualization utilities.
- matplotlib.pyplot: Plots images, graphs and visual representations of data and results.
- numpy: Provides efficient array operations and mathematical utilities for data processing.
- ssl: Used here to turn off HTTPS certificate verification so that dataset downloads do not fail on certificate errors.
- Set up global plot parameters and SSL context to prevent download errors.
Python
import torch
import torchvision
import matplotlib.pyplot as plt
import numpy as np
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
plt.rcParams['figure.figsize'] = 14, 6
Step 2: Loading and Normalizing the Dataset
We define a transform that converts each image to a tensor with values in [0, 1] and then normalizes every channel using mean 0.5 and standard deviation 0.5, which maps pixel values to the range [-1, 1]. We then download and load the CIFAR-10 dataset for both training and testing, applying the transform.
Python
normalize_transform = torchvision.transforms.Compose([
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))
])
train_dataset = torchvision.datasets.CIFAR10(
    root="./CIFAR10/train", train=True, transform=normalize_transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(
    root="./CIFAR10/test", train=False, transform=normalize_transform, download=True)
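To make the effect of the normalization concrete, here is a minimal sketch that applies the same arithmetic, (x - mean) / std, by hand. It uses plain torch instead of torchvision, and the tensor `pixels` is a made-up example rather than CIFAR-10 data.

```python
import torch

# Hypothetical pixel intensities after ToTensor(), which scales values to [0, 1].
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0])

# Same per-channel arithmetic that Normalize(mean=0.5, std=0.5) performs.
mean, std = 0.5, 0.5
normalized = (pixels - mean) / std

print(normalized.tolist())  # [-1.0, -0.5, 0.0, 1.0]
```

The smallest input (0.0) lands on -1 and the largest (1.0) on 1, so the transform rescales the data to [-1, 1] rather than enforcing a particular dataset mean.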
Step 3: Creating Data Loaders
- Set the batch size to 128.
- Create data loaders for the train and test sets to handle batching and iteration. The training loader shuffles the samples each epoch so that batches differ between epochs.
Python
batch_size = 128
train_loader = torch.utils.data.DataLoader(
    train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(
    test_dataset, batch_size=batch_size)
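As a rough check on what the batch size implies, the sketch below counts the batches per epoch for a dataset of 50,000 samples, the size of the CIFAR-10 training split. The names `dummy_dataset` and `dummy_loader` are hypothetical stand-ins filled with zeros so that nothing has to be downloaded.

```python
import math
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in dataset with as many samples as the CIFAR-10 training split.
# The zero-valued features and labels are placeholders, not real images.
num_samples = 50_000
dummy_dataset = TensorDataset(torch.zeros(num_samples, 1),
                              torch.zeros(num_samples, dtype=torch.long))

batch_size = 128
dummy_loader = DataLoader(dummy_dataset, batch_size=batch_size)

num_batches = len(dummy_loader)  # batches per epoch
last_batch_size = num_samples - (num_batches - 1) * batch_size

print(num_batches, math.ceil(num_samples / batch_size))  # 391 391
print(last_batch_size)                                   # 80
```

Because 50,000 is not a multiple of 128, the final batch is smaller than the rest. This is harmless here, but it matters if later code assumes a fixed batch size.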
Step 4: Visualizing Sample Images
- Obtain a batch of images and labels from the train loader.
- Display a grid of 25 training images for visual confirmation of the data pipeline.
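The plotting code for this step transposes the image grid before displaying it, because PyTorch stores images channel-first while matplotlib expects channels last. A minimal sketch of that axis reordering, using a placeholder array instead of a real image:

```python
import numpy as np

# Placeholder image in PyTorch's channel-first layout: (channels, height, width).
chw_image = np.zeros((3, 32, 32), dtype=np.float32)

# Move the channel axis to the end, giving (height, width, channels) for imshow.
hwc_image = np.transpose(chw_image, (1, 2, 0))

print(chw_image.shape, hwc_image.shape)  # (3, 32, 32) (32, 32, 3)
```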
Python
dataiter = iter(train_loader)
images, labels = next(dataiter)
plt.imshow(np.transpose(torchvision.utils.make_grid(
    images[:25], normalize=True, padding=1, nrow=5).numpy(), (1, 2, 0)))
plt.axis('off')
plt.show()
Output:
Sample Data for Training
Step 5: Analyzing Dataset Class Distribution
- Collect all class labels from training data.
- Count occurrences for every class and visualize with a bar chart, revealing class balance.
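The counting in this step relies on `np.unique` with `return_counts=True`. The sketch below shows its behavior on a short, made-up label list (the values in `labels` are illustrative only):

```python
import numpy as np

# Made-up label list standing in for the collected training labels.
labels = [0, 2, 1, 2, 0, 2]

# unique holds the sorted distinct labels, counts how often each one occurs.
unique, counts = np.unique(labels, return_counts=True)

print(unique.tolist())  # [0, 1, 2]
print(counts.tolist())  # [2, 1, 3]
```

On the CIFAR-10 training split, each of the 10 classes should come out at 5,000 images, so the bar chart is expected to be flat.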
Python
classes = []
for _, y in train_loader:
    # Collect the integer labels from every training batch
    classes.extend(y.tolist())
unique, counts = np.unique(classes, return_counts=True)
# Class names in index order, taken from the training set
names = train_dataset.classes
plt.bar(names, counts)
plt.xlabel("Target Classes")
plt.ylabel("Number of training instances")
plt.show()
Output:
Unique Classes and Respective Counts
Step 6: Building the CNN Architecture
Build a convolutional neural network (CNN) using PyTorch modules:
- Three blocks of convolution, ReLU activation and max pooling.
- Flatten the resulting feature maps and pass them through two fully connected layers.
- The second fully connected layer is the output layer, producing scores for the 10 classes.
Python
class CNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential(
            torch.nn.Conv2d(in_channels=3, out_channels=32,
                            kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2),
            torch.nn.Conv2d(in_channels=32, out_channels=64,
                            kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2),
            torch.nn.Conv2d(in_channels=64, out_channels=64,
                            kernel_size=3, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2),
            torch.nn.Flatten(),
            torch.nn.Linear(64 * 4 * 4, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, 10)
        )

    def forward(self, x):
        return self.model(x)
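The input size of the first fully connected layer, 64 * 4 * 4, follows from how each layer changes the spatial size of a 32x32 input. The sketch below traces that arithmetic with the standard output-size formula. The helper `conv_out` is a hypothetical name introduced for illustration, not part of PyTorch.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output size of a convolution or pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32  # CIFAR-10 images are 32x32
sizes = [size]
for _ in range(3):
    size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding 1: size unchanged
    size = conv_out(size, kernel=2, stride=2)   # 2x2 max pool (stride 2): size halved
    sizes.append(size)

flattened = 64 * size * size  # 64 channels come out of the last conv layer

print(sizes)      # [32, 16, 8, 4]
print(flattened)  # 1024
```

If the input resolution or the number of pooling layers changes, the `Linear` input size has to be recomputed in the same way.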
Step 7: Configuring the Training Process
- Select computation device: GPU if available, otherwise CPU.
- Instantiate the model and move it to the selected device.
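- Set the number of training epochs (50), the learning rate (0.001) and the weight decay (0.01).
- Use cross-entropy loss as the criterion and Adam as the optimizer.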
Python
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = CNN().to(device)
num_epochs = 50
learning_rate = 0.001
weight_decay = 0.01
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    model.parameters(), lr=learning_rate, weight_decay=weight_decay)
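A useful reference point for the loss values seen during training is the cross-entropy of a model that assigns equal probability to all 10 classes, which is ln(10), about 2.30. The following sketch checks this with all-zero logits. The logits and target labels are arbitrary illustrative values.

```python
import math
import torch

criterion = torch.nn.CrossEntropyLoss()

# All-zero logits give equal probability (1/10) to each of the 10 classes.
logits = torch.zeros(4, 10)
targets = torch.tensor([0, 3, 5, 9])  # arbitrary example labels

loss = criterion(logits, targets).item()

print(round(loss, 4), round(math.log(10), 4))  # 2.3026 2.3026
```

A training loss that stays near 2.30 therefore suggests the model has not yet learned anything beyond guessing uniformly.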
Step 8: Training the Model
- Train the CNN through all epochs.
- Set model to training mode.
- For each batch, move data to device, compute predictions and loss, backpropagate and update parameters.
- Accumulate and record mean loss per epoch.
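The per-batch steps listed above can be sketched in isolation as a single update on a toy problem. The names `toy_model`, `toy_inputs` and `toy_labels` are hypothetical, and the small linear model stands in for the CNN only to keep the example fast.

```python
import torch

torch.manual_seed(0)

# Tiny stand-in model and batch; the shapes are illustrative, not those of the CNN.
toy_model = torch.nn.Linear(4, 3)
toy_inputs = torch.randn(8, 4)
toy_labels = torch.randint(0, 3, (8,))

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(toy_model.parameters(), lr=0.001)

weights_before = toy_model.weight.detach().clone()

outputs = toy_model(toy_inputs)        # forward pass
loss = criterion(outputs, toy_labels)  # scalar loss for the batch
optimizer.zero_grad()                  # clear gradients left from any earlier step
loss.backward()                        # backpropagate to compute gradients
optimizer.step()                       # update the parameters in place

weights_changed = not torch.equal(weights_before, toy_model.weight.detach())
print(weights_changed)  # True
```

Calling `zero_grad()` before `backward()` matters because PyTorch accumulates gradients across calls by default.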
Python
train_loss_list = []
for epoch in range(num_epochs):
    print(f'Epoch {epoch+1}/{num_epochs}:', end=' ')
    train_loss = 0
    model.train()
    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
    train_loss_list.append(train_loss / len(train_loader))
    print(f"Training loss = {train_loss_list[-1]}")
Output:
Training
Step 9: Plotting Training Loss
Visualize the learning curve by plotting the mean training loss for each epoch.
Python
plt.plot(range(1, num_epochs + 1), train_loss_list)
plt.xlabel("Number of epochs")
plt.ylabel("Training loss")
plt.show()
Output:
Training Loss
Step 10: Evaluating Model Accuracy
- Switch model to evaluation mode and disable gradient calculations.
- For each test batch, compute predictions and accumulate number of correct classifications.
- Calculate and print total accuracy as percentage of correctly classified test images.
Python
test_acc = 0
model.eval()
with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        y_true = labels.to(device)
        outputs = model(images)
        # Predicted class is the index of the highest score
        _, y_pred = torch.max(outputs, 1)
        test_acc += (y_pred == y_true).sum().item()
print(f"Test set accuracy = {100 * test_acc / len(test_dataset)} %")
Output: Test set accuracy = 71.94 %
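To illustrate how the accuracy figure is computed, here is a small worked example on made-up scores for 4 samples and 3 classes. It uses `argmax`, which returns the same indices as the second value of `torch.max(outputs, 1)`.

```python
import torch

# Made-up class scores for 4 samples and 3 classes.
outputs = torch.tensor([[2.0, 1.0, 0.0],
                        [0.0, 3.0, 1.0],
                        [1.0, 0.0, 4.0],
                        [5.0, 0.0, 0.0]])
y_true = torch.tensor([0, 1, 2, 1])

y_pred = outputs.argmax(dim=1)                 # index of the highest score per row
num_correct = (y_pred == y_true).sum().item()  # number of matching predictions
accuracy = 100 * num_correct / len(y_true)

print(y_pred.tolist())  # [0, 1, 2, 0]
print(accuracy)         # 75.0
```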
Step 11: Visualizing Model Predictions
- From a test batch, select a few images and gather their actual and predicted class names.
- Show these images using a grid, with a title indicating both actual and predicted labels.
Python
num_images = 5
# images, y_true and y_pred still hold the last test batch from Step 10
y_true_name = [names[y_true[idx].item()] for idx in range(num_images)]
y_pred_name = [names[y_pred[idx].item()] for idx in range(num_images)]
title = f"Actual labels: {y_true_name}, Predicted labels: {y_pred_name}"
plt.imshow(np.transpose(torchvision.utils.make_grid(
    images[:num_images].cpu(), normalize=True, padding=1).numpy(), (1, 2, 0)))
plt.title(title)
plt.axis("off")
plt.show()
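Output:
Final Output
The predictions for these sample images can be checked against the actual labels in the title. With a test accuracy of about 72%, most predictions are correct, but some misclassifications are to be expected.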