Module3 Casestudy

The VGG16 model has 16 layers with learnable parameters consisting of convolution and fully connected layers. The input is 224x224 RGB images. It uses multiple 3x3 filters in succession instead of large filters like AlexNet and contains about 138 million parameters.

Uploaded by

Devarenjini P


Case study: Architecture of LeNet

LeNet usually refers to LeNet-5, a simple convolutional neural network.

What is LeNet-5?
LeNet-5 is one of the earliest convolutional neural networks, proposed
by Yann LeCun and others in 1998 in the research paper Gradient-Based
Learning Applied to Document Recognition. They used this architecture
to recognize handwritten and machine-printed characters.
The Architecture of the Model
Let’s understand the architecture of LeNet-5. The network has 5 layers
with learnable parameters, hence the name LeNet-5. It has three
convolution layers, with an average pooling layer after each of the
first two. After the convolution and average pooling layers come two
fully connected layers, the last of which is a Softmax classifier that
assigns each image to its class.

The input to this model is a 32X32 grayscale image, hence the number
of channels is one.

We then apply the first convolution operation with 6 filters of size
5X5. As a result, we get a feature map of size 28X28X6, because with
stride 1 and no padding:

H1 = 32 - 5 + 1 = 28
W1 = 32 - 5 + 1 = 28

Here the number of channels is equal to the number of filters applied.

After the first convolution operation, we apply average pooling with a
2X2 window and stride 2, which halves the size of the feature map:
(28-2)/2+1=14. Note that the number of channels is unchanged, so the
output is 14X14X6.
Next, we have a convolution layer with sixteen filters of size 5X5. The
feature map changes again, to 10X10X16; the output size is calculated
in the same manner (14-5+1=10). After this, we again apply an average
pooling or subsampling layer, which again reduces the size of the
feature map by half, i.e. to 5X5X16: (10-2)/2+1=5.

Then we have a final convolution layer with 120 filters of size 5X5.
Because each filter covers the whole 5X5 input, the feature map size is
1X1X120 (5-5+1=1). Flattening this result gives 120 values.

After these convolution layers, we have a fully connected layer with
eighty-four neurons. At last, we have an output layer with ten neurons
since the data have ten classes.
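
The size arithmetic in the walk-through above can be checked with a few lines of Python. This is only a sketch: the helper name out_size and the layer table LENET5 are illustrative names chosen here, and the code assumes stride 1 with no padding for the convolutions and a 2X2 window with stride 2 for pooling, as described in the text.

```python
def out_size(n, k, stride=1):
    """Output width/height of a conv or pooling window without padding."""
    return (n - k) // stride + 1

# (name, window size, stride, output channels or None for pooling layers)
LENET5 = [
    ("conv1", 5, 1, 6),
    ("pool1", 2, 2, None),
    ("conv2", 5, 1, 16),
    ("pool2", 2, 2, None),
    ("conv3", 5, 1, 120),
]

def trace(size, channels, layers):
    """Return the (name, height, width, channels) shape after every layer."""
    shapes = [("input", size, size, channels)]
    for name, k, stride, out_ch in layers:
        size = out_size(size, k, stride)
        if out_ch is not None:  # pooling keeps the channel count
            channels = out_ch
        shapes.append((name, size, size, channels))
    return shapes

lenet_shapes = trace(32, 1, LENET5)
for name, h, w, c in lenet_shapes:
    print(f"{name:6s} {h}X{w}X{c}")
```

The printed sizes run 32, 28, 14, 10, 5, 1, which matches the feature maps listed in the text.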
The final architecture of the LeNet-5 model is detailed layer by layer
below.

Architecture Details

The first layer is the input layer with feature map size 32X32X1.

Then we have the first convolution layer with 6 filters of size 5X5 and
stride 1. The activation function used at this layer is tanh. The
output feature map is 28X28X6.

Next, we have an average pooling layer with filter size 2X2 and stride 2.
The resulting feature map is 14X14X6, since the pooling layer doesn’t
affect the number of channels.

After this comes the second convolution layer with 16 filters of size
5X5 and stride 1. The activation function is again tanh. Now the output
size is 10X10X16.

Then comes the second average pooling layer of size 2X2 with stride 2.
As a result, the size of the feature map is reduced to 5X5X16.

The final convolution layer has 120 filters of size 5X5 with stride 1
and activation function tanh. Now the output size is 120.

Next is a fully connected layer with 84 neurons, which produces 84
output values; the activation function used here is again tanh.

The last layer is the output layer with 10 neurons and the Softmax
function. Softmax gives the probability that a data point belongs to
each class. The class with the highest probability is then predicted.
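
As a small illustration of the output layer, the sketch below turns ten raw scores into probabilities with Softmax and picks the most likely class. The scores are made-up numbers for demonstration only, and the function and variable names are chosen here rather than taken from the notes.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical outputs of the 10 neurons in the last layer
scores = [0.1, 2.3, -1.0, 0.5, 4.2, 0.0, 1.1, -0.7, 3.0, 0.9]
probs = softmax(scores)

# The predicted class is the index with the highest probability
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, round(probs[predicted], 3))
```

Since Softmax preserves the ordering of the scores, the predicted class is always the one with the largest raw score (index 4 in this made-up example).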

This is the entire architecture of the LeNet-5 model. The number of
trainable parameters of this architecture is around sixty thousand.
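
The figure of roughly sixty thousand can be reproduced approximately by counting weights and biases per layer. The sketch below is an assumption-laden estimate: it treats every filter in the second convolution layer as connected to all six input channels and gives the pooling layers no parameters, which is how the architecture is described in these notes. The original 1998 network differs in those details, so its exact count is a little different. The helper names are chosen here for illustration.

```python
def conv_params(k, in_ch, out_ch):
    """Weights plus one bias per filter for a k x k convolution."""
    return (k * k * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per neuron for a fully connected layer."""
    return (n_in + 1) * n_out

lenet_params = {
    "conv1": conv_params(5, 1, 6),      # 156
    "conv2": conv_params(5, 6, 16),     # 2,416 (assumes full connectivity)
    "conv3": conv_params(5, 16, 120),   # 48,120
    "fc1": dense_params(120, 84),       # 10,164
    "output": dense_params(84, 10),     # 850
}
lenet_total = sum(lenet_params.values())

for name, n in lenet_params.items():
    print(f"{name:7s} {n:>7,}")
print(f"total   {lenet_total:>7,}")
```

Under these assumptions the total is 61,706, consistent with "around sixty thousand". Most of the parameters sit in the third convolution layer.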

To summarize, the network has:

• 5 layers with learnable parameters.
• A grayscale image as input to the model.
• 3 convolution layers, two average pooling layers, and two fully
  connected layers, the last of which is a Softmax classifier.
• About 60,000 trainable parameters.

Architecture of AlexNet

What is AlexNet?
AlexNet is the name given to a convolutional neural network
architecture that won the ILSVRC competition in 2012.

ILSVRC (ImageNet Large Scale Visual Recognition Challenge) is a
competition where research teams evaluate their algorithms on a huge
dataset of labeled images (ImageNet) and compete to achieve high
accuracy on several visual recognition tasks.

AlexNet has eight layers with learnable parameters (weights): five
convolution layers, some of which are followed by max pooling, and then
3 fully connected layers. ReLU activation is used in each of these
layers except the output layer.

The authors found that using ReLU as the activation function made
training almost six times faster than with tanh. They also used dropout
layers, which helped prevent the model from overfitting. The model is
trained on the ImageNet dataset. The full ImageNet dataset has almost
14 million images; the ILSVRC subset used in the competition has about
1.2 million training images across a thousand classes.
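
To make the two ideas concrete, here is a minimal pure-Python sketch of ReLU and of dropout with rate 0.5. It uses the "inverted" form of dropout that is common in current libraries (surviving activations are scaled up during training), which is an assumption of this sketch and not a claim about how the AlexNet authors implemented it. The input values are made up.

```python
import random

def relu(x):
    """ReLU keeps positive values and replaces negative ones with zero."""
    return x if x > 0 else 0.0

def dropout(values, rate, rng, training=True):
    """Inverted dropout: zero each value with probability `rate` and
    scale the survivors by 1 / (1 - rate). Does nothing at test time."""
    if not training:
        return list(values)
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

rng = random.Random(0)  # fixed seed so the run is repeatable
pre_activations = [-1.2, 0.5, 2.0, -0.3, 1.5, 0.8]
activations = [relu(x) for x in pre_activations]
dropped = dropout(activations, rate=0.5, rng=rng)

print(activations)  # negatives become 0.0
print(dropped)      # each entry is either 0.0 or twice the activation
```

Because a different random subset of neurons is switched off at every training step, the network cannot rely on any single neuron, which is the intuition behind dropout reducing overfitting.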

Since AlexNet is a deep architecture, the authors introduced padding to
prevent the size of the feature maps from reducing drastically. The
input to this model is an image of size 227X227X3.

Then we apply the first convolution layer with 96 filters of size 11X11
with stride 4. The activation function used in this layer is ReLU. The
output feature map is 55X55X96.

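In case you are unaware of how to calculate the output size of a
convolution layer:

output = ((input - filter size + 2 x padding) / stride) + 1

Without padding this reduces to ((input - filter size) / stride) + 1,
e.g. (227 - 11)/4 + 1 = 55 for the first layer. Also, the number of
filters becomes the number of channels in the output feature map.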

Next, we have the first max-pooling layer, of size 3X3 and stride 2.
The resulting feature map has size 27X27X96.

After this, we apply the second convolution operation. This time the
filter size is reduced to 5X5 and we have 256 such filters. The stride
is 1 and the padding is 2. The activation function used is again ReLU.
Now the output size we get is 27X27X256.

Again we apply a max-pooling layer of size 3X3 with stride 2. The
resulting feature map is of shape 13X13X256.

Now we apply the third convolution operation with 384 filters of size
3X3, stride 1, and padding 1. Again the activation function used is
ReLU. The output feature map is of shape 13X13X384.

Then we have the fourth convolution operation with 384 filters of size
3X3. Both the stride and the padding are 1, and the activation function
used is ReLU. The output size remains unchanged, i.e. 13X13X384.

After this, we have the final convolution layer with 256 filters of
size 3X3. The stride and padding are both set to one, and the
activation function is ReLU. The resulting feature map is of shape
13X13X256.

So if you look at the architecture so far, the number of filters
generally increases as we go deeper (96, 256, 384, 384, then 256 in the
last convolution layer), so the network extracts more features in its
deeper layers. The filter size also decreases, from 11X11 to 5X5 to
3X3. The feature map shrinks mainly because of the stride-4 first
convolution and the max-pooling layers; the padded convolutions keep
the spatial size unchanged.

Next, we apply the third max-pooling layer of size 3X3 and stride 2,
resulting in a feature map of shape 6X6X256.

The feature map shapes through the convolutional part are therefore:

227x227x3 → 55x55x96 → 27x27x96 → 27x27x256 → 13x13x256 → 13x13x384 →
13x13x384 → 13x13x256 → 6x6x256
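
The chain of shapes above can be recomputed from the filter sizes, strides, and padding values given in the text. The sketch below does this with a small layer table; the names out_size and ALEXNET are illustrative, and integer (floor) division is assumed when the window does not fit evenly.

```python
def out_size(n, k, stride, pad):
    """Output width/height of a conv or pooling window with padding."""
    return (n + 2 * pad - k) // stride + 1

# (name, window size, stride, padding, output channels or None for pooling)
ALEXNET = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]

size, channels = 227, 3
alexnet_shapes = [("input", size, channels)]
for name, k, stride, pad, out_ch in ALEXNET:
    size = out_size(size, k, stride, pad)
    if out_ch is not None:  # pooling keeps the channel count
        channels = out_ch
    alexnet_shapes.append((name, size, channels))

for name, s, c in alexnet_shapes:
    print(f"{name:6s} {s}x{s}x{c}")
```

The final 6x6x256 feature map flattens to 6 * 6 * 256 = 9216 values, which is what the first fully connected layer receives.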

After this, we have our first dropout layer. The dropout rate is set to
0.5.

Then we have the first fully connected layer with a ReLU activation
function. The size of the output is 4096. Next comes another dropout
layer with the dropout rate fixed at 0.5.

This is followed by a second fully connected layer with 4096 neurons
and ReLU activation.

Finally, we have the last fully connected layer, or output layer, with
1000 neurons, as we have 1000 classes in the data set. The activation
function used at this layer is Softmax.

This is the architecture of the AlexNet model. It has a total of about
62.3 million learnable parameters.
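
The 62.3 million figure can be checked by adding up weights and biases for the eight layers as they are described above. This sketch assumes a single-path network in which every filter sees all input channels, and a 6x6x256 input to the first fully connected layer; the helper names are chosen here for illustration.

```python
def conv_params(k, in_ch, out_ch):
    """Weights plus one bias per filter for a k x k convolution."""
    return (k * k * in_ch + 1) * out_ch

def dense_params(n_in, n_out):
    """Weights plus one bias per neuron for a fully connected layer."""
    return (n_in + 1) * n_out

alexnet_params = {
    "conv1": conv_params(11, 3, 96),
    "conv2": conv_params(5, 96, 256),
    "conv3": conv_params(3, 256, 384),
    "conv4": conv_params(3, 384, 384),
    "conv5": conv_params(3, 384, 256),
    "fc1": dense_params(6 * 6 * 256, 4096),
    "fc2": dense_params(4096, 4096),
    "output": dense_params(4096, 1000),
}
alexnet_total = sum(alexnet_params.values())

for name, n in alexnet_params.items():
    print(f"{name:7s} {n:>11,}")
print(f"total   {alexnet_total:>11,}")
```

Under these assumptions the total comes to 62,378,344, and the three fully connected layers account for the large majority of it.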

To summarize the architecture:

• It has 8 layers with learnable parameters.
• The input to the model is an RGB image.
• It has 5 convolution layers combined with max-pooling layers.
• Then it has 3 fully connected layers.
• The activation function used in all hidden layers is ReLU.
• It uses two dropout layers.
• The activation function used in the output layer is Softmax.
• The total number of parameters in this architecture is 62.3 million.

VGG16 Architecture

VGG stands for Visual Geometry Group (a group of researchers at Oxford who
developed this architecture). The VGG architecture consists of blocks, where each
block is composed of 2D convolution and max pooling layers. VGGNet is most
commonly used in two variants, VGG16 and VGG19, where 16 and 19 are the numbers
of layers with learnable weights in each of them respectively.

VGG16 is a convolutional neural network model proposed by K. Simonyan and A.
Zisserman from the University of Oxford in the paper “Very Deep Convolutional
Networks for Large-Scale Image Recognition”. The model achieves 92.7% top-5
test accuracy on ImageNet. (The full ImageNet dataset has over 14 million
images; the ILSVRC benchmark used here covers 1000 classes.) It improves on
AlexNet by replacing large kernel-sized filters (11 and 5 in the first and
second convolutional layer, respectively) with multiple 3×3 kernel-sized
filters one after another.
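
One way to see why stacking 3×3 filters is attractive is to compare a stack with a single larger filter that covers the same input region. The sketch below assumes stride 1, the same number of input and output channels in every layer, and ignores biases; the channel count of 64 is just an illustrative choice, and the function names are chosen here.

```python
def stacked_receptive_field(n_layers, k=3):
    """Input region seen by one output pixel after n stacked k x k convs
    with stride 1."""
    return 1 + n_layers * (k - 1)

def conv_weights(k, channels, n_layers=1):
    """Weights (no biases) of n_layers k x k convs with `channels`
    input and output channels each."""
    return n_layers * k * k * channels * channels

C = 64  # illustrative channel count, not taken from the notes
comparison = []
for n_layers, big_k in ((2, 5), (3, 7)):
    rf = stacked_receptive_field(n_layers)
    stacked = conv_weights(3, C, n_layers)
    single = conv_weights(big_k, C)
    comparison.append((n_layers, rf, stacked, single))
    print(f"{n_layers} x 3x3: receptive field {rf}x{rf}, "
          f"{stacked:,} weights vs {single:,} for one {big_k}x{big_k}")
```

Two stacked 3×3 layers see a 5×5 region and three see a 7×7 region, with fewer weights than the single large filter, and each extra layer also adds another non-linearity.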

The input to the conv1 layer is a fixed-size 224 x 224 RGB image. The image is
passed through a stack of convolutional (conv.) layers, where the filters have a
very small receptive field: 3×3. The convolution stride is fixed to 1 pixel; the
padding is 1 pixel for 3×3 conv. layers.

To build the model, start by initializing it as a sequential model. After
initializing the model, add:

→ 2 x convolution layers of 64 channels with 3x3 kernel and same padding

→ 1 x maxpool layer of 2x2 pool size and stride 2x2

→ 2 x convolution layers of 128 channels with 3x3 kernel and same padding

→ 1 x maxpool layer of 2x2 pool size and stride 2x2

→ 3 x convolution layers of 256 channels with 3x3 kernel and same padding

→ 1 x maxpool layer of 2x2 pool size and stride 2x2

→ 3 x convolution layers of 512 channels with 3x3 kernel and same padding

→ 1 x maxpool layer of 2x2 pool size and stride 2x2

→ 3 x convolution layers of 512 channels with 3x3 kernel and same padding

→ 1 x maxpool layer of 2x2 pool size and stride 2x2

Add ReLU (Rectified Linear Unit) activation to each layer so that negative
values are not passed to the next layer.

After building the convolutional part, pass the data to the dense layers. To do
that, flatten the feature map that comes out of the convolutions and add:

→ 1 x Dense layer of 4096 units, ReLU activation

→ 1 x Dense layer of 4096 units, ReLU activation

→ 1 x Dense Softmax layer with one unit per class (1000 units for ImageNet;
a two-class problem would use 2 units)

The Softmax layer outputs a value between 0 and 1 for each class, reflecting
the model’s confidence that the image belongs to that class.

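Every convolution uses a 3x3 filter with stride 1 and padding 1, so it keeps
the spatial size unchanged. Every max-pooling layer uses a 2x2 window with
stride 2, so it halves the spatial size. For example, the last pooling layer
turns a 14x14x512 feature map into 7x7x512: (14-2)/2+1=7.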

16 layers of VGG16

1. Convolution using 64 filters
2. Convolution using 64 filters + Max pooling
3. Convolution using 128 filters
4. Convolution using 128 filters + Max pooling
5. Convolution using 256 filters
6. Convolution using 256 filters
7. Convolution using 256 filters + Max pooling
8. Convolution using 512 filters
9. Convolution using 512 filters
10. Convolution using 512 filters + Max pooling
11. Convolution using 512 filters
12. Convolution using 512 filters
13. Convolution using 512 filters + Max pooling
14. Fully connected with 4096 nodes
15. Fully connected with 4096 nodes
16. Output layer with Softmax activation with 1000 nodes
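
The sixteen layers listed above can be walked through in code to count layers, track the spatial size, and add up parameters. This is a sketch under the assumptions stated in the notes (3x3 convolutions that keep the spatial size, 2x2 max pooling with stride 2, a 224x224 RGB input, and a 1000-class output); the names VGG16_BLOCKS, DENSE_UNITS, and count_vgg16 are made up for this example.

```python
# (number of 3x3 conv layers, output channels) for each of the five blocks
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
DENSE_UNITS = [4096, 4096, 1000]

def count_vgg16(size=224, in_ch=3):
    """Return (layers with weights, final feature map size, parameters)."""
    layers = 0
    params = 0
    for n_convs, out_ch in VGG16_BLOCKS:
        for _ in range(n_convs):
            # 3x3 conv with padding 1 keeps `size`; weights plus biases
            params += (3 * 3 * in_ch + 1) * out_ch
            in_ch = out_ch
            layers += 1
        size = (size - 2) // 2 + 1  # 2x2 max pooling with stride 2
    features = size * size * in_ch  # flattened input to the dense layers
    for units in DENSE_UNITS:
        params += (features + 1) * units
        features = units
        layers += 1
    return layers, size, params

vgg_layers, final_size, vgg_params = count_vgg16()
print(vgg_layers, final_size, f"{vgg_params:,}")
```

Under these assumptions the count is 16 layers with weights, a final 7x7 feature map, and 138,357,544 parameters, i.e. roughly 138 million.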
