How to Use nn.MaxPool2d in PyTorch: Syntax and Example
Use nn.MaxPool2d in PyTorch to apply max pooling over 2D inputs such as images. Initialize it with parameters like kernel_size to define the pooling window, then apply it to your tensor to reduce spatial size by taking the maximum value in each window.
Syntax
The nn.MaxPool2d layer is created by specifying the kernel_size, which is the size of the window to take the maximum over. Optional parameters include stride (step size of the window), padding (implicit negative-infinity padding added to both sides of the input), and dilation (spacing between kernel elements).
Typical usage:
- kernel_size: int or tuple, size of the pooling window
- stride: int or tuple, how far the window moves each step (defaults to kernel_size)
- padding: int or tuple, implicit -inf padding added to the input edges
- dilation: int or tuple, spacing between kernel elements (usually 1)
```python
import torch.nn as nn

maxpool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1)
```
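Given these parameters, the output spatial size follows the formula from the PyTorch docs: H_out = floor((H_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1). A quick sketch that checks the formula against the layer itself:

```python
import math

import torch
import torch.nn as nn

def maxpool2d_out_size(size, kernel_size, stride, padding=0, dilation=1):
    # Output-size formula from the nn.MaxPool2d documentation (floor mode)
    return math.floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1)

x = torch.randn(1, 1, 7, 7)
pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
out = pool(x)

print(out.shape[-1])                   # actual width from the layer: 4
print(maxpool2d_out_size(7, 3, 2, 1))  # formula gives the same value: 4
```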
Example
This example shows how to create a max pooling layer with a 2x2 window and stride 2, then apply it to a 4x4 input tensor with one channel and batch size 1. The output tensor has reduced spatial size because max pooling picks the largest value in each 2x2 block.
```python
import torch
import torch.nn as nn

# Create a tensor with shape (batch_size=1, channels=1, height=4, width=4)
input_tensor = torch.tensor([[[[ 1.0,  2.0,  3.0,  4.0],
                               [ 5.0,  6.0,  7.0,  8.0],
                               [ 9.0, 10.0, 11.0, 12.0],
                               [13.0, 14.0, 15.0, 16.0]]]])

# Define max pooling layer
maxpool = nn.MaxPool2d(kernel_size=2, stride=2)

# Apply max pooling
output = maxpool(input_tensor)

print("Input Tensor:\n", input_tensor)
print("\nOutput Tensor after MaxPool2d:\n", output)
```
Output
Input Tensor:
tensor([[[[ 1., 2., 3., 4.],
[ 5., 6., 7., 8.],
[ 9., 10., 11., 12.],
[13., 14., 15., 16.]]]])
Output Tensor after MaxPool2d:
tensor([[[[ 6., 8.],
[14., 16.]]]])
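Because stride defaults to kernel_size, the 2x2 windows above do not overlap. Setting a smaller stride makes them overlap and produces a larger output; a sketch on the same 4x4 input:

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[ 1.,  2.,  3.,  4.],
                    [ 5.,  6.,  7.,  8.],
                    [ 9., 10., 11., 12.],
                    [13., 14., 15., 16.]]]])

# stride omitted -> stride = kernel_size = 2, non-overlapping windows
print(nn.MaxPool2d(kernel_size=2)(x))
# tensor([[[[ 6.,  8.],
#           [14., 16.]]]])

# stride=1 -> overlapping windows, larger 3x3 output
print(nn.MaxPool2d(kernel_size=2, stride=1)(x))
# tensor([[[[ 6.,  7.,  8.],
#           [10., 11., 12.],
#           [14., 15., 16.]]]])
```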
Common Pitfalls
Common mistakes when using nn.MaxPool2d include:
- Not setting stride, which defaults to kernel_size. If you want overlapping windows, set a smaller stride explicitly.
- Miscounting padding, which changes the output size. Note that nn.MaxPool2d pads with negative infinity rather than zeros, so padded positions never become the max.
- Feeding input tensors with the wrong shape. The input must be 4D, (batch_size, channels, height, width), or 3D, (channels, height, width), for a single unbatched image; a plain 2D (height, width) tensor raises an error.
Example of a wrong input shape and the fix:
```python
import torch
import torch.nn as nn

maxpool = nn.MaxPool2d(2)

# Wrong: input is 2D (missing batch and channel dimensions)
input_wrong = torch.randn(4, 4)  # shape (height, width)
try:
    output_wrong = maxpool(input_wrong)
except RuntimeError as e:
    print("Error:", type(e).__name__)  # exact message varies by PyTorch version

# Right: add batch and channel dimensions
input_right = input_wrong.unsqueeze(0).unsqueeze(0)  # shape (1, 1, 4, 4)
output_right = maxpool(input_right)
print("Output shape with correct input:", output_right.shape)
```
Output
Error: RuntimeError
Output shape with correct input: torch.Size([1, 1, 2, 2])
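One detail worth noting: nn.MaxPool2d pads with negative infinity, not zeros, so padded positions can never win the max. A small sketch with an all-negative input makes this visible; if the padding were zeros, the border outputs would be 0:

```python
import torch
import torch.nn as nn

x = torch.full((1, 1, 2, 2), -5.0)  # every value is -5
pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=1)

out = pool(x)
print(out)
# Every window touches padded positions, yet the output stays -5:
# the implicit padding value is -inf, so it never beats -5.
```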
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| kernel_size | Size of the window to take max over | Required |
| stride | Step size of the window movement | kernel_size |
| padding | Implicit -inf padding added to input edges | 0 |
| dilation | Spacing between kernel elements | 1 |
| return_indices | If True, returns max indices for unpooling | False |
| ceil_mode | If True, use ceil instead of floor to compute output shape | False |
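The ceil_mode flag is easy to verify directly: it changes whether a window that runs off the edge still produces an output. A sketch on a 5x5 input, where a 2x2 window with stride 2 leaves one row and column over:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 5, 5)

# floor mode (default): the leftover row/column is dropped
print(nn.MaxPool2d(2)(x).shape)                  # torch.Size([1, 1, 2, 2])

# ceil mode: the partial window at the edge still produces an output
print(nn.MaxPool2d(2, ceil_mode=True)(x).shape)  # torch.Size([1, 1, 3, 3])
```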
Key Takeaways
nn.MaxPool2d reduces spatial size by taking max values in sliding windows over 2D inputs.
Set kernel_size to define the pooling window and stride to control step size; stride defaults to kernel_size.
Input to MaxPool2d must be 4D, (batch_size, channels, height, width), or 3D for a single unbatched image.
Padding changes the output size; MaxPool2d pads with -inf, so padded values never become the max.
Use return_indices=True if you plan to reverse pooling with MaxUnpool2d.
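As a closing sketch of the last takeaway, return_indices=True pairs with nn.MaxUnpool2d to place the pooled maxima back at their original positions, with zeros everywhere else:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.tensor([[[[1., 2.],
                    [3., 4.]]]])

pooled, indices = pool(x)           # pooled = tensor([[[[4.]]]])
restored = unpool(pooled, indices)  # 4 returns to its original position
print(restored)
# tensor([[[[0., 0.],
#           [0., 4.]]]])
```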