CNN | Introduction to Pooling Layer

Last Updated : 02 Apr, 2025

A pooling layer is used in CNNs to reduce the spatial dimensions (width and height) of the input feature maps while retaining the most important information. It slides a two-dimensional filter over each channel of a feature map and summarizes the features within the region covered by the filter.

For a feature map with dimensions n_h \times n_w \times n_c, the dimensions of the output after a pooling layer (with no padding) are:

\left( \left\lfloor \frac{n_h - f}{s} \right\rfloor + 1 \right) \times \left( \left\lfloor \frac{n_w - f}{s} \right\rfloor + 1 \right) \times n_c

where:

n_h → height of the feature map
n_w → width of the feature map
n_c → number of channels in the feature map
f → size of the pooling filter
s → stride length
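
As a quick sanity check on the output size, here is a minimal sketch in plain Python. It assumes no padding and uses the standard relation out = floor((n - f) / s) + 1 along each spatial axis; the function name `pooled_shape` is made up for this illustration.

```python
def pooled_shape(n_h, n_w, n_c, f, s):
    """Output shape of a pooling layer with an f x f window, stride s, no padding."""
    out_h = (n_h - f) // s + 1
    out_w = (n_w - f) // s + 1
    # Pooling acts on each channel separately, so n_c is unchanged.
    return out_h, out_w, n_c


print(pooled_shape(4, 4, 1, f=2, s=2))     # (2, 2, 1)
print(pooled_shape(32, 32, 16, f=2, s=2))  # (16, 16, 16)
print(pooled_shape(7, 7, 8, f=3, s=2))     # (3, 3, 8)
```

The first call matches the 4 x 4 single-channel feature map used in the Keras examples in this article, which a 2 x 2 window with stride 2 reduces to 2 x 2.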

A typical CNN model architecture consists of multiple convolution and pooling layers stacked together.

Why are Pooling Layers Important?

Dimensionality Reduction: Pooling layers reduce the spatial size of the feature maps, which decreases the computation in later layers and the number of parameters in any fully connected layers that follow. Pooling itself has no learnable parameters. This makes the model faster and more memory-efficient.
Translation Invariance: Pooling helps the network become invariant to small translations or distortions in the input image. For example, even if an object in an image is slightly shifted, the pooled output will remain relatively unchanged.
Overfitting Prevention: By reducing the spatial dimensions, and with them the number of parameters in later layers, pooling can help reduce overfitting and acts as a mild form of regularization.
Feature Hierarchy: Pooling layers help build a hierarchical representation of features, where lower layers capture fine details and higher layers capture more abstract and global features.
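
The translation invariance point can be illustrated with a small NumPy sketch. This is not how Keras implements pooling; `max_pool_2x2` is a hypothetical helper that assumes a 2D input with even height and width, a 2 x 2 window and a stride of 2.

```python
import numpy as np


def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D array whose sides are even."""
    h, w = x.shape
    # Group pixels into non-overlapping 2x2 blocks, then take each block's maximum.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# A single strong activation at row 2, column 2 of a 6x6 map.
original = np.zeros((6, 6), dtype=np.float32)
original[2, 2] = 1.0

# The same activation moved one pixel to the right.
shifted = np.roll(original, shift=1, axis=1)

print(np.array_equal(original, shifted))                              # False
print(np.array_equal(max_pool_2x2(original), max_pool_2x2(shifted)))  # True
```

The two inputs differ, but their pooled outputs are identical because the activation stays inside the same 2 x 2 window. A shift that crosses a window boundary (for example from column 3 to column 4) does change the pooled output, which is why the invariance only holds for small shifts.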

Types of Pooling Layers

1. Max Pooling

Max pooling selects the maximum element from the region of the feature map covered by the filter. Thus, the output of a max pooling layer is a feature map containing the most prominent features of the previous feature map.

Max pooling preserves the strongest activations (for example, responses to edges and textures) and often performs well in practice, which makes it the most commonly used pooling type.

Max Pooling in Keras:

Python

from tensorflow.keras.layers import MaxPooling2D
import numpy as np

# Example input feature map
feature_map = np.array([
    [1, 3, 2, 9],
    [5, 6, 1, 7],
    [4, 2, 8, 6],
    [3, 5, 7, 2]
]).reshape(1, 4, 4, 1)

# Applying max pooling
max_pool = MaxPooling2D(pool_size=(2, 2), strides=2)
output = max_pool(feature_map)

print(output.numpy().reshape(2, 2))

Output:

[[6 9]
 [5 8]]

2. Average Pooling

Average pooling computes the average of the elements in the region of the feature map covered by the filter. Thus, while max pooling gives the most prominent feature in a particular patch of the feature map, average pooling gives the average of the features present in that patch.

Average pooling provides a smoother, more generalized representation of the input. It is useful in cases where preserving the overall context is more important than keeping the strongest individual activations.

Average Pooling using Keras:

Python

import numpy as np
from tensorflow.keras.layers import AveragePooling2D

feature_map = np.array([
    [1, 3, 2, 9],
    [5, 6, 1, 7],
    [4, 2, 8, 6],
    [3, 5, 7, 2]
], dtype=np.float32).reshape(1, 4, 4, 1)  # Convert to float32

# Applying average pooling
avg_pool = AveragePooling2D(pool_size=(2, 2), strides=2)
output = avg_pool(feature_map)
print(output.numpy().reshape(2, 2))

Output:

[[3.75 4.75]
 [3.5  5.75]]

3. Global Pooling

Global pooling reduces each channel in the feature map to a single value, producing a 1 \times 1 \times n_c output. This is equivalent to applying a filter of size n_h \times n_w.

There are two types of global pooling:

Global Max Pooling: Takes the maximum value across the entire feature map.
Global Average Pooling: Computes the average of all values in the feature map.

Global Pooling using Keras:

Python

import numpy as np
from tensorflow.keras.layers import GlobalMaxPooling2D, GlobalAveragePooling2D

feature_map = np.array([
    [1, 3, 2, 9],
    [5, 6, 1, 7],
    [4, 2, 8, 6],
    [3, 5, 7, 2]
], dtype=np.float32).reshape(1, 4, 4, 1) 

# Applying global max pooling
gm_pool = GlobalMaxPooling2D()
gm_output = gm_pool(feature_map)

# Applying global average pooling
ga_pool = GlobalAveragePooling2D()
ga_output = ga_pool(feature_map)

print("Global Max Pooling Output:", gm_output.numpy())
print("Global Average Pooling Output:", ga_output.numpy())

Output:

Global Max Pooling Output: [[9.]]
Global Average Pooling Output: [[4.4375]]
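
With more than one channel, global pooling is simply a reduction over the height and width axes. The following NumPy sketch (not the Keras implementation) shows this, assuming the same channels-last layout (batch, height, width, channels) as the Keras examples; the input values are arbitrary.

```python
import numpy as np

# A batch of one feature map with height 2, width 2 and 3 channels
# (layout: batch, height, width, channels).
x = np.arange(12, dtype=np.float32).reshape(1, 2, 2, 3)

# Reduce over the height and width axes only, one value per channel.
global_max = x.max(axis=(1, 2))
global_avg = x.mean(axis=(1, 2))

print(global_max.shape)     # (1, 3)
print(global_max.tolist())  # [[9.0, 10.0, 11.0]]
print(global_avg.tolist())  # [[4.5, 5.5, 6.5]]

# keepdims=True keeps the 1 x 1 spatial axes, giving the 1 x 1 x n_c form.
print(x.max(axis=(1, 2), keepdims=True).shape)  # (1, 1, 1, 3)
```

By default the Keras global pooling layers also return one value per channel with the spatial axes dropped, which is why the outputs above have shape (batch, channels).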

How Do Pooling Layers Work?

Define a Pooling Window (Filter): The size of the pooling window (e.g., 2x2) is chosen, along with a stride (the step size by which the window moves). A common choice is a 2x2 window with a stride of 2, which halves the height and width of the feature map.
Slide the Window Over the Input: The pooling operation is applied to each region of the input feature map covered by the window.
Apply the Pooling Operation: Depending on the type of pooling (max, average, etc.), the operation extracts the required value from each window.
Output the Downsampled Feature Map: The result is a smaller feature map that retains the most important information.
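
The four steps above can be sketched directly in NumPy. This is an illustrative loop-based version, not how Keras implements pooling; the function name `pool2d` and its parameters are assumptions, and it handles a single 2D channel with no padding.

```python
import numpy as np


def pool2d(feature_map, size=2, stride=2, reduce_fn=np.max):
    """Pool a 2D feature map with a size x size window and no padding."""
    # Step 1: the window size and stride are given by the arguments.
    h, w = feature_map.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    output = np.empty((out_h, out_w), dtype=np.float32)

    for i in range(out_h):
        for j in range(out_w):
            # Step 2: the region currently covered by the window.
            top, left = i * stride, j * stride
            window = feature_map[top:top + size, left:left + size]
            # Step 3: summarize the region with the chosen operation.
            output[i, j] = reduce_fn(window)

    # Step 4: the downsampled feature map.
    return output


feature_map = np.array([
    [1, 3, 2, 9],
    [5, 6, 1, 7],
    [4, 2, 8, 6],
    [3, 5, 7, 2],
], dtype=np.float32)

print(pool2d(feature_map, reduce_fn=np.max).tolist())   # [[6.0, 9.0], [5.0, 8.0]]
print(pool2d(feature_map, reduce_fn=np.mean).tolist())  # [[3.75, 4.75], [3.5, 5.75]]
```

Passing `np.max` or `np.mean` as `reduce_fn` reproduces the max pooling and average pooling results from the Keras examples on the same 4 x 4 feature map.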

Key Factors to Consider for Optimizing a Pooling Layer

Pooling Window Size: The size of the pooling window affects the degree of downsampling. A larger window results in more aggressive downsampling but may lose important details.
Stride: The stride determines how much the pooling window moves at each step. A larger stride results in greater dimensionality reduction.
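Padding: When the window does not fit evenly into the input, the leftover border rows and columns are dropped unless padding is added. Padding ensures that the pooling operation covers the entire input feature map.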
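
To see how these three factors interact, the sketch below computes the output length along one spatial axis for a few window and stride settings. The `"valid"` and `"same"` names follow the values accepted by the `padding` argument of the Keras pooling layers; the helper `pooled_length` itself is hypothetical, and the `"same"` case assumes the TensorFlow convention of an output length of ceil(n / s).

```python
import math


def pooled_length(n, f, s, padding="valid"):
    """Output length along one spatial axis for window f and stride s."""
    if padding == "valid":
        # No padding: positions where the window does not fit are dropped.
        return (n - f) // s + 1
    if padding == "same":
        # Pad the borders so every input position falls inside some window.
        return math.ceil(n / s)
    raise ValueError("padding must be 'valid' or 'same'")


n = 7  # input length along one axis
for f, s in [(2, 2), (3, 2), (3, 3), (2, 1)]:
    valid = pooled_length(n, f, s, "valid")
    same = pooled_length(n, f, s, "same")
    print(f"f={f}, s={s}: valid={valid}, same={same}")
```

For an input of length 7, a 2 x 2 window with stride 2 gives 3 outputs without padding (the last row or column is ignored) and 4 with padding, while a stride of 1 barely reduces the size at all.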

Advantages of Pooling Layers

Dimensionality reduction: Pooling layers reduce the spatial dimensions of the feature maps. This lowers the computational cost and can also help reduce overfitting by lowering the number of parameters in later layers.
Translation invariance: Pooling layers help make the feature maps invariant to small translations. A small shift in the position of an object changes the pooled output very little, so the same features tend to be detected and the classification result is less sensitive to the object's exact position.
Feature selection: Pooling layers help in selecting the most important features from the input, as max pooling selects the most salient features and average pooling provides a balanced representation.

Disadvantages of Pooling Layers

Information Loss: Pooling reduces spatial resolution, which can lead to a loss of important fine details.
Over-smoothing: Excessive pooling may blur out crucial features.
Hyperparameter Tuning: The choice of pooling size and stride affects performance and requires careful tuning.

savyakhosla
