
Convolutional Neural Network

CP - 6
Machine Learning
M S Prasad
In a convolutional network (ConvNet), there are basically three types of layers:
1.Convolution layer
2.Pooling layer
3.Fully connected layer

Why Convolutions?
There are primarily two major advantages of using convolutional layers over using just fully connected layers:
1.Parameter sharing
2.Sparsity of connections
[Figure: convolving filters over an input data set; the example shows 54 weight parameters and 2 bias terms]
Stride: the number of pixels the filter shifts to the right at each step until it reaches the end of the row, and the number of pixels it then shifts downward, until it reaches the bottom of the image.
[Figure: convolution with stride 1 vs. stride 2]

•Input: n X n; Padding: p
•Stride: s; Filter size: f X f
•Output: [(n+2p-f)/s + 1] X [(n+2p-f)/s + 1], taking the floor if (n+2p-f)/s is not an integer
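A quick sanity check of this formula as a small Python helper (the function name is ours, purely illustrative):

```python
def conv_output_size(n: int, f: int, p: int = 0, s: int = 1) -> int:
    """Output side length of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# Example: 7x7 input, 3x3 filter, no padding, stride 2 -> 3x3 output
print(conv_output_size(7, 3, p=0, s=2))  # 3
```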

Padding: sometimes we want to make use of all the pixels in the image, including those at the border; padding simply specifies how many rows and columns of zeros we add around the border of the image before convolving.
Padding

•Convolution without padding gives
•Input: n X n
•Filter size: f X f
•Output: (n-f+1) X (n-f+1)
This raises two problems:
1.Every time we apply a convolutional operation, the size of the image shrinks.
2.Pixels at the corners of the image are used in only a few convolutions compared to the central pixels, so information at the border is under-weighted.

This is where padding comes to the fore:


Input: n X n
Padding: p
Filter size: f X f
Output: (n+2p-f+1) X (n+2p-f+1)
There are two common choices for padding:
Valid: It means no padding. If we are using valid padding, the output will be (n-f+1) X (n-f+1)
Same: Here, we apply padding so that the output size is the same as the input size, i.e.,
n+2p-f+1 = n
So, p = (f-1)/2
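As a quick check of both choices (a minimal sketch; the helper name is ours, assuming an odd filter size f):

```python
def same_padding(f: int) -> int:
    """Padding that keeps the output size equal to the input size (odd f)."""
    assert f % 2 == 1, "'same' padding needs an odd filter size"
    return (f - 1) // 2

n, f = 6, 3
p = same_padding(f)
print(n + 2 * p - f + 1)  # 6: 'same' output, equal to the input
print(n - f + 1)          # 4: 'valid' (no padding) output
```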
One Layer of a Convolutional Network

z[1] = w[1] * a[0] + b[1]
a[1] = g(z[1])

Here * denotes convolution and g is a non-linear activation function (e.g. ReLU).
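To make the layer concrete, here is a minimal NumPy sketch of this forward step (a naive loop implementation with illustrative names, not an optimized one; 'valid' convolution, ReLU as g):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def conv_layer_forward(a_prev, w, b, stride=1):
    """a_prev: (n, n, nc) input; w: (f, f, nc, nf) filters; b: (nf,) biases."""
    n, _, _ = a_prev.shape
    f, _, _, nf = w.shape
    out = (n - f) // stride + 1
    z = np.zeros((out, out, nf))
    for i in range(out):
        for j in range(out):
            patch = a_prev[i*stride:i*stride+f, j*stride:j*stride+f, :]
            for k in range(nf):
                z[i, j, k] = np.sum(patch * w[..., k]) + b[k]  # z = w*a + b
    return relu(z)  # a = g(z)

a1 = conv_layer_forward(np.random.randn(6, 6, 3),
                        np.random.randn(3, 3, 3, 10), np.zeros(10))
print(a1.shape)  # (4, 4, 10)
```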

Suppose we have 10 filters, each of shape 3 X 3 X 3. What will be the number of parameters in that layer? Let's try to solve this:
•Number of parameters for each filter = 3*3*3 = 27
•There will be a bias term for each filter, so total parameters per filter = 28
•As there are 10 filters, the total parameters for that layer = 28*10 = 280
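The same count as a one-line Python check (sizes taken from the example above):

```python
f, nc, nf = 3, 3, 10
params = (f * f * nc + 1) * nf  # the +1 is the bias term per filter
print(params)  # 280
```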
Generalized dimensions can be given as:
•Input: n X n X nc
•Filter: f X f X nc
•Padding: p
•Stride: s
•Output: [(n+2p-f)/s+1] X [(n+2p-f)/s+1] X nc’
Here, nc is the number of channels in the input and filter, while nc’ is the number of filters.
 
Pooling layer
Non-linear downsampling of the volume: small windows slide over the output of the previous layer and keep, for example, the maximum or the average value of each rectangular region. Pooling reduces the spatial size, which cuts the number of parameters and computations in later layers, and additionally helps counter overfitting, i.e. high training accuracy but low validation accuracy.

If the input is nh X nw X nc, then the output will be [(nh-f)/s + 1] X [(nw-f)/s + 1] X nc.
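A minimal NumPy max-pooling sketch matching this shape rule (illustrative; pooling is applied per channel):

```python
import numpy as np

def max_pool(a, f=2, s=2):
    """a: (nh, nw, nc). Returns ((nh-f)//s+1, (nw-f)//s+1, nc)."""
    nh, nw, nc = a.shape
    oh, ow = (nh - f) // s + 1, (nw - f) // s + 1
    out = np.zeros((oh, ow, nc))
    for i in range(oh):
        for j in range(ow):
            # max over the f x f window, separately for each channel
            out[i, j] = a[i*s:i*s+f, j*s:j*s+f].max(axis=(0, 1))
    return out

print(max_pool(np.random.randn(4, 4, 10)).shape)  # (2, 2, 10)
```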
BIAS
The relation between the bias and the result of a convolution: the bias adds one fixed value to the result at every position of its channel. Consequently, for the error received from the layer above, each bias value must be updated according to the error of its channel.
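A sketch of that update in NumPy (illustrative): since one bias is shared across all spatial positions of a channel, its gradient is the upstream error summed over those positions:

```python
import numpy as np

# dZ: upstream error of shape (height, width, nf), one slice per filter/channel
dZ = np.random.randn(4, 4, 10)
db = dZ.sum(axis=(0, 1))  # one gradient value per channel/bias
print(db.shape)  # (10,)
```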
Normalisation layer Different kinds of normalisation layers have been proposed
to normalise the data, but have not proven useful in practice and have therefore
not gained any solid ground.

Fully connected layer: neurons in this layer are fully connected to all activations in the previous layer, as in regular neural networks. These layers usually sit at the end of the network, e.g. outputting the class probabilities.
Loss layer: often the last layer in the network; it computes the objective of the task, such as classification, e.g. by applying the softmax function.
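For reference, a numerically stable softmax in Python (the standard formulation, not code from the slides):

```python
import numpy as np

def softmax(z):
    """Map raw scores z to class probabilities; subtracting max(z) avoids overflow."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1
```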
A key property of the convolution layer is that all spatial locations share the same convolution kernel, which greatly reduces the number of parameters needed for a convolution layer.
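A back-of-the-envelope illustration of that saving (the sizes here are hypothetical): connecting a 32 X 32 X 3 input to a 28 X 28 X 10 output fully connected versus with ten shared 5 X 5 X 3 filters:

```python
# Fully connected: every input unit to every output unit, plus biases
fc_params = (32 * 32 * 3) * (28 * 28 * 10) + 28 * 28 * 10   # ~24 million
# Convolutional: ten shared 5x5x3 filters, plus one bias each
conv_params = (5 * 5 * 3 + 1) * 10                          # 760
print(fc_params, conv_params)
```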

In a deep neural network setup, convolution also encourages parameter sharing.

The combination of convolution kernels with deep, hierarchical structures is very effective in learning good representations (features) from images for visual recognition tasks.

A key concept in CNNs (or more generally in deep learning) is distributed representation. For example, suppose our task is to recognize N different types of objects and a CNN extracts M features from any input image. It is most likely that any one of the M features is useful for recognizing all N object categories, and recognizing one object type requires the joint effort of all M features.
LeNet-5

•Parameters: 60k
•Layer flow: Conv -> Pool -> Conv -> Pool -> FC -> FC -> Output
•Activation functions: sigmoid/tanh (ReLU in modern variants)
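A minimal Keras sketch of this layer flow (our illustrative code, using ReLU as in modern variants; layer sizes follow the classic LeNet-5, details simplified):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(6, 5, activation="relu", input_shape=(32, 32, 1)),  # Conv
    layers.AveragePooling2D(2),                                       # Pool
    layers.Conv2D(16, 5, activation="relu"),                          # Conv
    layers.AveragePooling2D(2),                                       # Pool
    layers.Flatten(),
    layers.Dense(120, activation="relu"),                             # FC
    layers.Dense(84, activation="relu"),                              # FC
    layers.Dense(10, activation="softmax"),                           # Output
])
model.summary()  # roughly 60k parameters
```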
AlexNet

•Parameters: 60 million
•Activation function: ReLU
Inception Network
Residual Blocks
The general flow of activations through a residual block can be given as:

a[l] -> z[l+1] = w[l+1]*a[l] + b[l+1] -> a[l+1] = g(z[l+1]) -> z[l+2] = w[l+2]*a[l+1] + b[l+2] -> a[l+2] = g(z[l+2] + a[l])

The shortcut (skip connection) feeds a[l] directly into the activation two layers ahead. The benefit of training a residual network is that even if we train deeper networks, the training error does not increase.
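A hedged Keras sketch of one residual block with an identity shortcut (our illustrative code; filter count and kernel size are assumptions, and the shortcut requires matching channel counts):

```python
from tensorflow import keras
from tensorflow.keras import layers

def residual_block(x, filters=64):
    shortcut = x                                          # a[l]
    y = layers.Conv2D(filters, 3, padding="same")(x)      # z[l+1]
    y = layers.Activation("relu")(y)                      # a[l+1] = g(z[l+1])
    y = layers.Conv2D(filters, 3, padding="same")(y)      # z[l+2]
    y = layers.Add()([y, shortcut])                       # z[l+2] + a[l]
    return layers.Activation("relu")(y)                   # a[l+2]

inp = keras.Input(shape=(32, 32, 64))
out = residual_block(inp)
model = keras.Model(inp, out)
```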
