CNN Notes

This document discusses CNN models and their components. A CNN consists of feature extraction layers such as convolution and pooling layers, as well as classification layers such as fully connected layers. Convolution layers automatically learn features from images like edges, shapes and objects. Pooling layers reduce the dimensionality of feature maps. ReLU layers introduce non-linearity, while flattening layers convert feature maps to vectors for the fully connected layers to perform classification.

A CNN model can be thought of as a combination of two components: a feature extraction part and a classification part. The convolution + pooling layers perform feature extraction. For example, given an image, the convolution layers detect features such as two eyes, long ears, four legs, a short tail and so on. The fully connected layers then act as a classifier on top of these features and assign a probability that the input image is a dog.
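As a rough illustration (a sketch, not taken from these notes), the two parts map directly onto the layer stack of a small Keras model; the 128 x 128 x 3 input size, the filter counts and the dog/not-dog output below are arbitrary assumptions:

from tensorflow.keras import layers, models

model = models.Sequential([
    # Feature extraction part: convolution + pooling layers
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    # Classification part: fully connected layers on top of the extracted features
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability that the image is a dog
])
model.summary()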

The convolution layers are the main powerhouse of a CNN model. Automatically detecting meaningful features given only an image and a label is not an easy task. The convolution layers learn such complex features by building on top of each other: the first layers detect edges, the next layers combine them to detect shapes, and the following layers merge this information to infer, for instance, that this is a nose. To be clear, the CNN doesn't know what a nose is; by seeing a lot of them in images, it learns to detect that as a feature. The fully connected layers learn how to use the features produced by the convolutions in order to correctly classify the images.

Why do we prefer Convolutional Neural Networks (CNN) over Artificial Neural Networks (ANN) for image data as input?

1. Feedforward neural networks can learn a single feature representation of the image, but in the case of complex images an ANN will fail to give better predictions because it cannot learn the pixel dependencies present in the images.

2. A CNN can learn multiple layers of feature representations of an image by applying filters, or transformations.

3. In a CNN, the number of parameters the network has to learn is significantly lower than in a multilayer fully connected network, since the number of units in the network decreases, therefore reducing the chance of overfitting (see the rough comparison after this list).

4. A CNN also considers the context information in a small neighborhood of each pixel, and due to this it is very important for achieving better predictions on data like images. Since digital images are large grids of pixel values, it makes sense to use a CNN to analyze them: the network progressively reduces the size of these representations, which keeps the training phase cheaper in computational power with little information loss.
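As a back-of-the-envelope comparison for point 3 (illustrative numbers, not from the notes), a small convolution layer has a few hundred parameters regardless of the image size, while a single fully connected layer on the flattened image already has millions:

h, w, c = 224, 224, 3   # an assumed RGB input image

# Convolution layer: 32 filters of size 3x3 over 3 input channels, plus 32 biases.
# The count is independent of the image size because the weights are shared.
conv_params = 3 * 3 * c * 32 + 32          # = 896

# Fully connected layer: every flattened pixel connected to 100 hidden units, plus biases.
dense_params = h * w * c * 100 + 100       # = 15,052,900

print(conv_params, dense_params)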

Explain the different layers in CNN.

The different layers involved in the architecture of CNN are as follows:

1. Input Layer: The input layer in a CNN should contain the image data. Image data is represented by a three-dimensional matrix. We have to reshape the image into a single column.

For example, suppose we have the MNIST dataset and an image of dimension 28 x 28 = 784; we need to convert it into a 784 x 1 column before feeding it to the input. If we have k training examples in the dataset, the dimension of the input will be (784, k).

2. Convolutional Layer: This layer performs the convolution operation, creating several smaller picture windows that slide over the data.

3. ReLU Layer: This layer introduces non-linearity into the network and converts all the negative pixel values to zero. The output is a rectified feature map.

4. Pooling Layer: Pooling is a down-sampling operation that reduces the dimensionality of the feature map.

5. Fully Connected Layer: This layer identifies and classifies the objects in the image.

6. Softmax / Logistic Layer: The softmax or logistic layer is the last layer of the CNN and resides at the end of the fully connected layers. Logistic is used for binary classification problems and softmax for multi-class classification problems.

7. Output Layer: This layer contains the label in the form of a one-hot encoded vector.
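One way to line these layers up with code (a sketch, not from the notes) is a small Keras model on the 28 x 28 MNIST images mentioned above; note that the Keras convolution layers take the 2-D image directly, and the flattening happens just before the fully connected part:

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),           # 1. Input layer (MNIST image)
    layers.Conv2D(32, (3, 3)),                # 2. Convolutional layer
    layers.Activation("relu"),                # 3. ReLU layer
    layers.MaxPooling2D((2, 2)),              # 4. Pooling layer
    layers.Flatten(),
    layers.Dense(128, activation="relu"),     # 5. Fully connected layer
    layers.Dense(10, activation="softmax"),   # 6./7. Softmax + output layer (10 classes)
])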

Explain the significance of the ReLU activation function in Convolutional Neural Networks.

ReLU Layer – After each convolution operation, the ReLU operation is applied. ReLU is a non-linear activation function. The operation is applied to each pixel and replaces all the negative pixel values in the feature map with zero.

An image is usually highly non-linear, i.e. its pixel values vary in ways that are very difficult for a purely linear model to predict correctly. The convolution itself is a linear operation, so the ReLU activation is applied after it to supply the non-linearity the network needs and make the job easier.

Therefore this layer helps in the detection of features: by converting negative pixel values to zero it introduces non-linearity into the network and allows it to detect the variations of features.

In short, non-linearity is introduced into the convolution (a linear operation) by using a non-linear activation function like ReLU.
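A minimal sketch of the operation itself (invented numbers): every negative value in the feature map becomes zero and positive values pass through unchanged, i.e. f(x) = max(0, x).

import numpy as np

feature_map = np.array([[-3.0, 1.5],
                        [ 0.2, -0.7]])

rectified = np.maximum(feature_map, 0)   # ReLU applied element-wise
print(rectified)
# [[0.  1.5]
#  [0.2 0. ]]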

Why do we use a Pooling Layer in a CNN?

A CNN uses pooling layers to reduce the size of the representation of the input image, which speeds up the computation of the network.

Pooling (or spatial pooling) layers are also called subsampling or downsampling layers.

• Pooling is applied after the convolution and ReLU operations.
• It reduces the dimensionality of each feature map while retaining the most important information.
• Without it, the number of hidden layers required to learn the complex relations present in the image would be large.

As a result of pooling, even if the picture were a little tilted, the largest number in a certain region of the feature map would still be recorded and hence the feature would be preserved. As another benefit, reducing the size by a significant amount requires less computational power. Pooling is therefore also useful for extracting dominant features.
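A small max-pooling sketch (made-up numbers): a 2 x 2 window with stride 2 keeps only the largest value in each region, halving the height and width of the feature map.

import numpy as np

fmap = np.array([[1, 3, 2, 0],
                 [4, 6, 1, 2],
                 [0, 1, 8, 5],
                 [2, 3, 4, 7]])

# Group the 4x4 map into 2x2 blocks and take the maximum of each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 2]
#  [3 8]]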

Explain the role of the flattening layer in CNN.

After a series of convolution and pooling operations on the feature representation of the image, we flatten the output of the final pooling layer into a single long continuous linear array, or vector.

The process of converting all the resultant 2-D arrays into a vector is called flattening.

The flattened output is fed as input to the fully connected neural network, which has a varying number of hidden layers, to learn the non-linear complexities present in the feature representation.
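For example (assumed shapes, not from the notes), a final pooling output of 5 x 5 with 16 channels flattens into a 400-element vector:

import numpy as np

pool_output = np.random.rand(5, 5, 16)   # (height, width, channels) of the last pooling layer
flat = pool_output.reshape(-1)           # what a Flatten layer does
print(flat.shape)                        # (400,)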

What are the hyperparameters of a Pooling Layer?

The hyperparameters for a pooling layer are:

• Filter size
• Stride
• Max or average pooling
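In Keras these hyperparameters appear directly as layer arguments (the values below are just illustrative):

from tensorflow.keras import layers

max_pool = layers.MaxPooling2D(pool_size=(2, 2), strides=2)      # filter size 2x2, stride 2, max pooling
avg_pool = layers.AveragePooling2D(pool_size=(3, 3), strides=1)  # average pooling instead of max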

What is the role of the Fully Connected (FC) Layer in CNN?

The aim of the fully connected layer is to use the high-level features of the input image, produced by the convolutional and pooling layers, to classify the input image into various classes based on the training dataset.

Fully connected means that every neuron in the previous layer is connected to each and every neuron in the next layer. The sum of the output probabilities from the fully connected part is 1, which is achieved by using a softmax activation function in the output layer.

The softmax function takes a vector of arbitrary real-valued scores and transforms it into a vector of values between 0 and 1 that sum to 1.
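A plain-NumPy sketch of that transformation (not from the notes):

import numpy as np

def softmax(scores):
    exps = np.exp(scores - np.max(scores))   # subtract the max for numerical stability
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, -1.0]))
print(probs)          # roughly [0.705 0.259 0.035]
print(probs.sum())    # 1.0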

Working

It works like an ANN: random weights are assigned to each synapse, the weighted input is passed through an activation function, and the output is compared with the true values. The resulting error is back-propagated, i.e. the weights are re-calculated, and the whole process is repeated. This is done until the error, or cost function, is minimized.
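In Keras, this loop of forward pass, error computation and weight updates is what compile and fit run; a sketch, reusing the model defined earlier, where x_train and y_train are placeholders for your training data:

model.compile(optimizer="sgd",                   # weight updates via gradient descent
              loss="categorical_crossentropy",   # the error / cost function
              metrics=["accuracy"])

# Each epoch: forward pass, compare with the true labels, back-propagate, update the weights.
model.fit(x_train, y_train, epochs=10)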

Briefly explain the two major steps of CNN, i.e., Feature Learning and Classification.

Feature Learning is the stage in which the algorithm learns features from the dataset. Components like Convolution, ReLU, and Pooling work for that, with numerous iterations between them. Once the features are known, classification happens using the Flattening and Full Connection components.

VGG is a convolutional neural network from researchers at Oxford's Visual Geometry Group, hence the name VGG. It was the runner-up of the 2014 ImageNet classification challenge with a 7.3% error rate. ImageNet is the most comprehensive hand-annotated visual dataset, and competitions are held every year where researchers from all around the world compete. All the famous CNN architectures make their debut at that competition.

VGG is a very fundamental CNN model. It's the first one that comes to mind if you need to use an off-the-shelf model for a particular task. The paper is also very well written. There are much more complicated models which perform better; for example, Microsoft's ResNet model was the winner of the 2015 ImageNet challenge with a 3.6% error rate, but that model has 152 layers! We will cover all these CNN architectures in depth in another article.
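The pretrained VGG16 model ships with keras.applications, so a quick way to get the model examined below is (a sketch):

from tensorflow.keras.applications import VGG16

vgg = VGG16(weights="imagenet", include_top=True)
vgg.summary()   # lists the layer names block1_conv1 ... block5_conv3 used below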

We will visualize the 3 most crucial components of the VGG model:

• Feature maps
• Convnet filters
• Class output
We will visualize the feature maps to see how the input is
transformed passing through the convolution layers. The feature
maps are also called intermediate activations since the output of a
layer is called the activation.

Remember that the output of a convolution layer is a 3D volume. As we discussed above, the height and width correspond to the dimensions of the feature map, and each depth channel is a distinct feature map encoding independent features. So we will visualize individual feature maps by plotting each channel as a 2D image.

How to visualize the feature maps is actually pretty simple. We pass an input image through the CNN and record the intermediate activations. We then randomly select some of the feature maps and plot them.
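A sketch of that procedure in Keras (assuming the vgg model loaded above; img is a placeholder for a preprocessed input image of shape (1, 224, 224, 3)):

import matplotlib.pyplot as plt
from tensorflow.keras.models import Model

# Build a model that returns the output (activation) of every convolution layer.
conv_layers = [l for l in vgg.layers if "conv" in l.name]
activation_model = Model(inputs=vgg.input,
                         outputs=[l.output for l in conv_layers])

activations = activation_model.predict(img)   # one activation volume per conv layer
first = activations[0]                        # block1_conv1, shape (1, 224, 224, 64)

# Plot the first 8 feature maps (channels) of block1_conv1.
for i in range(8):
    plt.subplot(1, 8, i + 1)
    plt.imshow(first[0, :, :, i], cmap="viridis")
    plt.axis("off")
plt.show()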

VGG convolutional layers are named as follows: blockX_convY. For example, the second convolution layer in the third convolution block is called block3_conv2. In the architecture diagram above it corresponds to the second purple filter.

For example, one of the feature maps from the output of the very first layer (block1_conv1) looks as follows. Bright areas are the "activated" regions, meaning the filter detected the pattern it was looking for. This filter seems to encode an eye and nose detector.

The following figure displays 8 feature maps per layer. block1_conv1 actually contains 64 feature maps, since we have 64 filters in that layer, but we are only visualizing the first 8 per layer in this figure.
• The first-layer feature maps (block1_conv1) retain most of the information present in the image. In CNN architectures the first layers usually act as edge detectors.
• As we go deeper into the network, the feature maps look less like the original image and more like an abstract representation of it. As you can see in block3_conv1 the cat is somewhat visible, but after that it becomes unrecognizable. The reason is that deeper feature maps encode high-level concepts like "cat nose" or "dog ear" while lower-level feature maps detect simple edges and shapes. That's why deeper feature maps contain less information about the image and more about the class of the image. They still encode useful features, but they are less visually interpretable by us.
• The feature maps become sparser as we go deeper, meaning the filters detect fewer features. This makes sense because the filters in the first layers detect simple shapes, and every image contains those; but as we go deeper we start looking for more complex stuff like "dog tail", and such features don't appear in every image. That's why, in the first figure with 8 filters per layer, we see more of the feature maps go blank as we go deeper (block4_conv1 and block5_conv1).
