Convolution Neural Network: CP - 6 Machine Learning M S Prasad
In a convolutional network (ConvNet), there are basically three types of layers:
1. Convolution layer
2. Pooling layer
3. Fully connected layer
Why Convolutions?
There are primarily two major advantages of using convolutional layers over using just fully connected layers:
1. Parameter sharing
2. Sparsity of connections
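To make the first advantage concrete, the sketch below compares parameter counts for a fully connected layer versus a convolution layer on the same input and output shapes. The sizes (32 X 32 X 3 input, six 5 X 5 filters, 28 X 28 X 6 output) are chosen here for illustration and are not from the slides.

```python
# Fully connected: every input unit connects to every output unit.
fc_params = (32 * 32 * 3) * (28 * 28 * 6)   # weights only

# Convolution: six 5x5x3 filters, shared across all spatial positions.
conv_params = 6 * (5 * 5 * 3 + 1)           # each filter: 75 weights + 1 bias

print(fc_params)    # 14450688
print(conv_params)  # 456
```

Parameter sharing (one small kernel reused everywhere) is what collapses millions of weights down to a few hundred.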
Convolution parameter count: e.g. two 3 X 3 X 3 filters give 3 x 3 x 3 x 2 = 54 weight parameters plus 2 biases.
Stride: the number of pixels the filter moves to the right at each step until it reaches the right edge of the image, and the number of pixels it moves downward until it reaches the bottom of the image.
[Figure: convolution with stride 1 vs. stride 2]
•Input: n X n; Padding: p
•Stride: s; Filter size: f X f
•Output: [(n+2p-f)/s + 1] X [(n+2p-f)/s + 1], taking the floor of the division
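The output-size formula above can be checked with a small helper; the example sizes are chosen here for illustration.

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# 7x7 input, 3x3 filter, no padding, stride 1 -> 5x5 output
print(conv_output_size(7, 3))            # 5
# same input with padding 1 and stride 2 -> 4x4 output
print(conv_output_size(7, 3, p=1, s=2))  # 4
```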
Padding: Sometimes we want to make use of all the pixels in the image, including those at the border. Padding indicates how many rows and columns of zeros we add around the border of the image.
Padding
•A convolution without padding gives:
•Input: n X n
•Filter size: f X f
•Output: (n-f+1) X (n-f+1)
1. Every time we apply a convolution operation, the size of the image shrinks.
2. Pixels at the corners of the image are used far fewer times during convolution than the central pixels, so information near the border is under-weighted.
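A minimal sketch of the operation described above, in plain Python: a naive 2D convolution (cross-correlation, as used in CNNs) with optional zero padding and stride. The image and kernel values are made up for illustration.

```python
def conv2d(image, kernel, p=0, s=1):
    """Naive 2D cross-correlation with zero padding p and stride s."""
    n, f = len(image), len(kernel)
    # Zero-pad the image on all four sides.
    padded = [[0] * (n + 2 * p) for _ in range(n + 2 * p)]
    for i in range(n):
        for j in range(n):
            padded[i + p][j + p] = image[i][j]
    # Output size follows the formula (n + 2p - f)/s + 1.
    out = (n + 2 * p - f) // s + 1
    result = [[0] * out for _ in range(out)]
    for i in range(out):
        for j in range(out):
            result[i][j] = sum(
                padded[i * s + di][j * s + dj] * kernel[di][dj]
                for di in range(f) for dj in range(f)
            )
    return result

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]                 # 2x2 filter
print(conv2d(img, k))                # valid conv: 3x3 input shrinks to 2x2
print(len(conv2d(img, k, p=1)))      # with p=1 the output grows to 4x4
```

With p = 0 the 3x3 input shrinks to 2x2, and the corner pixels enter only one window each, illustrating both problems listed above.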
Fully connected layer: Neurons in this layer are fully connected to all activations in the previous layer, as in regular neural networks. These layers usually appear at the end of the network, e.g. to output the class probabilities.
Loss layer: Often the last layer in the network; it computes the objective of the task, such as classification, e.g. by applying the softmax function.
A key property of the convolution layer is that all spatial locations share the same convolution kernel, which greatly reduces the number of parameters needed for a convolution layer.
The combination of convolution kernels with deep, hierarchical structures is very effective in learning good representations (features) from images for visual recognition tasks.
A key concept in CNNs (or, more generally, deep learning) is distributed representation. For example, suppose our task is to recognize N different types of objects and a CNN extracts M features from any input image. It is most likely that any one of the M features is useful for recognizing all N object categories, and that recognizing one object type requires the joint effort of all M features.
LeNet-5
•Parameters: 60k
•Layers flow: Conv -> Pool -> Conv -> Pool -> FC -> FC -> Output
•Activation functions: sigmoid/tanh in the original network (modern implementations often use ReLU)
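The ~60k parameter figure for the layer flow above can be verified by hand. The layer sizes used below (32 X 32 X 1 input, 5 X 5 filters, 120- and 84-unit fully connected layers, 10 output classes) are the classic LeNet-5 sizes, assumed here since the slide does not list them.

```python
# Parameter count per layer: filters * (filter volume + 1 bias),
# or (inputs * outputs + biases) for fully connected layers.
conv1 = 6 * (5 * 5 * 1 + 1)    # 6 filters over 1 channel   -> 156
conv2 = 16 * (5 * 5 * 6 + 1)   # 16 filters over 6 channels -> 2416
fc1   = 400 * 120 + 120        # 5*5*16 = 400 units -> 120  -> 48120
fc2   = 120 * 84 + 84          #                            -> 10164
out   = 84 * 10 + 10           # 10 digit classes           -> 850
total = conv1 + conv2 + fc1 + fc2 + out
print(total)                   # 61706, i.e. roughly 60k
```

Note that almost all of the parameters sit in the fully connected layers; the convolution layers contribute only a few thousand, thanks to parameter sharing.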
AlexNet
•Parameters: 60 million
•Activation function: ReLU
INCEPTION NETWORK
Residual Blocks
The general flow to calculate activations from different layers can be given as:
a[l] -> z[l+1] = W[l+1] a[l] + b[l+1] -> a[l+1] = g(z[l+1]) -> z[l+2] = W[l+2] a[l+1] + b[l+2] -> a[l+2] = g(z[l+2] + a[l])
The shortcut (skip connection) adds a[l] directly to z[l+2] before applying the final activation.
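The activation flow above can be sketched as a forward pass in plain Python. The `relu`, `linear`, and `residual_block` helpers and the 2-unit sizes are made up for illustration; real residual blocks use convolutions rather than small dense layers.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(W, a, b):
    # z = W a + b for a weight matrix W (list of rows) and bias vector b
    return [sum(wij * aj for wij, aj in zip(row, a)) + bi
            for row, bi in zip(W, b)]

def residual_block(a_l, W1, b1, W2, b2):
    z1 = linear(W1, a_l, b1)   # z[l+1] = W[l+1] a[l] + b[l+1]
    a1 = relu(z1)              # a[l+1] = g(z[l+1])
    z2 = linear(W2, a1, b2)    # z[l+2] = W[l+2] a[l+1] + b[l+2]
    # skip connection: add a[l] before the final activation
    return relu([z + a for z, a in zip(z2, a_l)])   # a[l+2] = g(z[l+2] + a[l])

# With zero weights the block reduces to the identity (plus ReLU) --
# this is why residual blocks are easy for deep networks to learn.
Z = [[0.0, 0.0], [0.0, 0.0]]
print(residual_block([1.0, 2.0], Z, [0.0, 0.0], Z, [0.0, 0.0]))  # [1.0, 2.0]
```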