Convolution Neural Network

A Convolutional Neural Network (CNN) is a deep learning model primarily used for image processing tasks such as classification and detection. Key components include convolutional layers, pooling layers, and activation functions like ReLU and Softmax, which help in feature extraction and classification. CNNs utilize techniques like stride and padding to manage data dimensions and preserve important edge information.

Uploaded by

sumitdorle91
Copyright
© All Rights Reserved

UNIT 5 Convolution Neural Network

A Convolutional Neural Network (CNN) is a type of deep learning model
designed for processing structured grid data, particularly images. CNNs are
widely used in computer vision tasks such as image classification, object
detection, and segmentation.

Building Blocks of Convolutional Neural Networks (CNNs)

Stride:

In CNNs, stride refers to the step size with which the filter (kernel) moves
across the input image or feature map during convolution or pooling
operations.

→ Stride controls how much the filter shifts at each step.
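
To see how stride affects the size of the output, the sketch below computes the output length along one dimension. It assumes the commonly used formula floor((n + 2p - k) / s) + 1, where n is the input length, k the filter size, s the stride, and p the padding. The function name `conv_output_size` and its parameter names are illustrative and do not come from these notes.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one dimension for input length n and filter size k."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    # Number of positions the filter can take when it moves `stride` steps at a time.
    return (n + 2 * padding - k) // stride + 1


# A 7-pixel-wide input with a 3-wide filter:
size_stride_1 = conv_output_size(7, 3, stride=1)  # 5
size_stride_2 = conv_output_size(7, 3, stride=2)  # 3
print(size_stride_1, size_stride_2)
```

A larger stride means fewer filter positions, so the output becomes smaller.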


Padding:

Padding refers to adding extra pixels (usually zeros) around the border of the
input image or feature map before applying convolution or pooling operations.

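Why Is Padding Required?

1. To Control Output Size

Without padding, the output feature map shrinks after every convolution
operation (a k × k filter at stride 1 removes k − 1 pixels from each dimension).
→ With stride 1, suitable ("same") padding preserves the original input size.

2. To Handle Edge Information

Without padding, edge and corner pixels are covered by fewer filter positions
than interior pixels, so they contribute less to the output.
→ Padding lets the filter be centered on border pixels, so edge features are
better represented.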
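
The effect of zero padding on size can be illustrated with NumPy. This is a minimal sketch that assumes a 3×3 input, a 3×3 filter, stride 1, and one ring of zeros; the variable names are illustrative.

```python
import numpy as np

image = np.arange(1, 10).reshape(3, 3)  # 3x3 input with values 1..9

# Add one ring of zeros around the border (zero padding with p = 1).
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)
print(padded.shape)  # (5, 5)

# With a k x k filter at stride 1, the output size is n + 2p - k + 1.
k = 3
valid_size = image.shape[0] - k + 1   # no padding: 3 - 3 + 1 = 1
same_size = padded.shape[0] - k + 1   # p = 1:      5 - 3 + 1 = 3
print(valid_size, same_size)          # 1 3
```

Without padding the 3×3 input collapses to a single value, while one ring of zeros keeps the output at 3×3.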
Pooling

Pooling is a down-sampling (or sub-sampling) operation used in CNNs to
reduce the spatial dimensions (Width × Height) of feature maps while retaining
the most important information.

Pooling helps to:

● Reduce computation
● Reduce memory usage
● Control overfitting
● Provide a degree of translation invariance (detect features despite small
shifts in location)

Types of Pooling

Max Pooling
→ Selects the maximum value from the pooling window.
→ Retains the strongest activations and discards weaker ones.
→ Most commonly used pooling type in CNNs.

Average Pooling
→ Computes the average of all values within the pooling window.
→ Gives a smoother representation: noise is reduced, but fine detail is
blurred as well.

Global Max Pooling
→ Takes the maximum value from the entire feature map of each channel.
→ Captures the most significant feature in the whole feature map.

Global Average Pooling
→ Calculates the average value over the entire feature map of each channel.
→ Often used before the final classification layer to reduce dimensions.

Min Pooling
→ Selects the minimum value from the pooling window.
→ Rarely used, but helpful in defect detection or other applications where the
smallest value is important.
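
The pooling types listed above can be compared on one small feature map. The following is a minimal NumPy sketch, assuming a single-channel 4×4 map and a 2×2 window with stride 2; the helper `pool2d` is a hypothetical name written for this illustration, not a library function.

```python
import numpy as np


def pool2d(fmap, size=2, stride=2, mode="max"):
    """Pool a 2D feature map with a size x size window."""
    reducers = {"max": np.max, "avg": np.mean, "min": np.min}
    reduce_fn = reducers[mode]
    out_h = (fmap.shape[0] - size) // stride + 1
    out_w = (fmap.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = reduce_fn(fmap[r:r + size, c:c + size])
    return out


fmap = np.array([[1, 3, 2, 4],
                 [5, 6, 1, 2],
                 [7, 2, 9, 0],
                 [1, 8, 3, 4]])

max_pooled = pool2d(fmap, mode="max")  # [[6, 4], [8, 9]]
avg_pooled = pool2d(fmap, mode="avg")  # [[3.75, 2.25], [4.5, 4.0]]
min_pooled = pool2d(fmap, mode="min")  # [[1, 1], [1, 0]]

# Global pooling reduces the whole map to a single value per channel.
global_max = fmap.max()   # 9
global_avg = fmap.mean()  # 3.625
```

Each windowed variant turns the 4×4 map into a 2×2 map; only the rule used to summarize each window changes.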

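Convolution Over Volume:

In Convolutional Neural Networks (CNNs), especially when processing
colored images or other multi-channel data, convolution over volume means
applying the convolution across the full depth (all channels) of the input,
not just over width and height. Each filter has the same depth as the input
and produces one 2D feature map; stacking the maps from several filters gives
the depth of the output.
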
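
Convolution over volume can be sketched in NumPy as follows. This sketch assumes stride 1, no padding, and the cross-correlation form of convolution that deep learning libraries typically use; the function name `conv_over_volume` and the array shapes are illustrative choices.

```python
import numpy as np


def conv_over_volume(image, filters):
    """Valid convolution (stride 1, no padding) of an H x W x C image
    with a bank of filters of shape (num_filters, k, k, C)."""
    h, w, c = image.shape
    n_f, k, _, c_f = filters.shape
    assert c == c_f, "filter depth must match the input depth"
    out = np.zeros((h - k + 1, w - k + 1, n_f))
    for f in range(n_f):
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                patch = image[i:i + k, j:j + k, :]         # k x k x C volume
                out[i, j, f] = np.sum(patch * filters[f])  # sum over all channels
    return out


rng = np.random.default_rng(0)
image = rng.random((6, 6, 3))       # e.g. a 6x6 RGB image
filters = rng.random((2, 3, 3, 3))  # two 3x3x3 filters
output = conv_over_volume(image, filters)
print(output.shape)  # (4, 4, 2)
```

Each filter spans all three channels and yields one 4×4 map, so two filters give an output of depth 2.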
Softmax:

→ Softmax is an activation function used in machine learning, especially in the
output layer of neural networks for multi-class classification.

→ It converts raw output scores (logits) from a neural network into a
probability distribution over multiple classes.
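
The conversion from logits to probabilities can be sketched as below. Softmax is defined as exp(z_i) divided by the sum of exp(z_j) over all classes; subtracting the maximum logit first is a common numerical-stability step assumed here, and the example logits are made up for illustration.

```python
import numpy as np


def softmax(logits):
    """Convert a 1D array of logits into probabilities."""
    shifted = logits - np.max(logits)  # avoids overflow; does not change the result
    exps = np.exp(shifted)
    return exps / np.sum(exps)


logits = np.array([2.0, 1.0, 0.1])  # hypothetical raw scores for 3 classes
probs = softmax(logits)
print(probs)        # three probabilities, largest for the first class
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```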
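Properties of Softmax:

→ Every output value lies between 0 and 1.
→ The output values sum to 1.
→ The ordering of the logits is preserved: the largest logit receives the
highest probability.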

Flattening:
Flattening is the process of converting a multi-dimensional feature map
(output of convolution and pooling layers) into a one-dimensional vector.
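
In NumPy, flattening amounts to a reshape. The sketch below assumes a small 2×2×3 feature map stored as height × width × channels; the variable names are illustrative.

```python
import numpy as np

feature_maps = np.arange(2 * 2 * 3).reshape(2, 2, 3)  # height x width x channels
vector = feature_maps.reshape(-1)                     # one-dimensional vector

print(feature_maps.shape)  # (2, 2, 3)
print(vector.shape)        # (12,)
```

No values are changed or dropped; only their arrangement changes.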
Building Blocks of Convolutional Neural Networks (CNNs):

1. Input Layer
● Accepts raw image data.
● Image shape → Width × Height × Channels (e.g., 224×224×3 for an RGB image)

2. Convolutional Layer
● Core layer that extracts features such as edges, textures, and shapes.
● It applies filters (kernels) across the image to create feature maps.

Operations in Convolutional Layer:

● Filter/Kernel → Small matrix of learnable weights (e.g., 3×3 or 5×5)
● Stride → Step size of filter movement
● Padding → Controls feature map size ("same" keeps the input size at stride 1; "valid" uses no padding)
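
How filter, stride, and padding work together can be sketched on a single-channel image. The example below assumes a hand-written vertical-edge filter (in a trained CNN the filter values are learned) and uses cross-correlation, as most CNN libraries do; `conv2d` is a hypothetical helper written for this illustration.

```python
import numpy as np


def conv2d(image, kernel, stride=1, padding=0):
    """Slide a square kernel over a 2D image (cross-correlation)."""
    if padding > 0:
        image = np.pad(image, padding, mode="constant")  # zero padding
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + k, c:c + k] * kernel)
    return out


# Bright left half, dark right half: a vertical edge in the middle.
image = np.array([[10, 10, 10, 0, 0, 0]] * 6, dtype=float)
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)

valid = conv2d(image, vertical_edge)              # shape (4, 4)
same = conv2d(image, vertical_edge, padding=1)    # shape (6, 6)
strided = conv2d(image, vertical_edge, stride=2)  # shape (2, 2)
print(valid[0])  # values 0, 30, 30, 0: strong response where the edge is
```

The feature map is zero in flat regions and large where the brightness changes, which is the sense in which a filter extracts a feature.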

3. Activation Function (ReLU)
● Adds non-linearity to the CNN.
● Replaces negative values with zero → output = max(0, x)

Benefits:

● Faster training (cheap to compute, and the gradient does not saturate for positive inputs)
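
ReLU applied element-wise to a feature map can be sketched in one line of NumPy. The input values below are made up for illustration.

```python
import numpy as np


def relu(x):
    """Element-wise max(0, x)."""
    return np.maximum(0, x)


feature_map = np.array([[-3.0, 2.0],
                        [0.5, -1.0]])
activated = relu(feature_map)
print(activated)  # negatives become 0; positive values pass through unchanged
```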

4. Pooling Layer (Subsampling/Downsampling)
● Reduces the spatial size of feature maps.
● Makes the CNN computationally efficient.
● Provides translation invariance.

Types of Pooling:

● Max Pooling → Maximum value from region
● Average Pooling → Average of values in region
● Global Pooling → Pool entire feature map into 1 value per channel

5. Flattening Layer
● Converts 2D/3D feature maps into a 1D vector.
● Connects CNN layers to Fully Connected Layers.

6. Fully Connected Layer (Dense Layer)
● Traditional Neural Network layer.
● Every neuron is connected to every neuron of the next layer.
● Performs final classification based on extracted features.
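
A fully connected layer can be sketched as a matrix-vector product plus a bias. The sizes below (12 input features, 4 output neurons) and the random weights are assumptions for illustration; in a trained network the weights and biases are learned.

```python
import numpy as np

rng = np.random.default_rng(0)

features = rng.random(12)           # flattened feature vector (12 values)
weights = rng.normal(size=(4, 12))  # 4 output neurons, each connected to all 12 inputs
bias = np.zeros(4)

scores = weights @ features + bias  # one raw score (logit) per output neuron
print(scores.shape)  # (4,)
```

Each row of `weights` holds the connections from every input value to one output neuron, which is what "fully connected" refers to.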


7. Output Layer
● Gives the final prediction.
● Uses an activation such as:
○ Softmax → Multi-class classification
○ Sigmoid → Binary classification
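
For the binary case, the sigmoid output can be sketched as below. The logit value and the 0.5 decision threshold are assumptions for illustration; the threshold is a common default rather than a fixed rule.

```python
import numpy as np


def sigmoid(z):
    """Map a raw score to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


logit = 0.8            # hypothetical raw score from the last dense layer
p = sigmoid(logit)     # probability of the positive class
label = int(p >= 0.5)  # predicted class: 1 if p is at least 0.5, else 0
print(label)  # 1
```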
