
How to Use nn.MaxPool2d in PyTorch: Syntax and Example

Use nn.MaxPool2d in PyTorch to apply max pooling on 2D inputs like images. Initialize it with parameters like kernel_size to define the pooling window, then apply it to your tensor to reduce spatial size by taking the maximum value in each window.

Syntax

The nn.MaxPool2d layer is created by specifying kernel_size, the size of the window to take the maximum over. Optional parameters include stride (step size of the window), padding (implicit negative-infinity padding added around the input, so padded positions never become the maximum), and dilation (spacing between kernel elements).

Typical usage:

  • kernel_size: int or tuple, size of the pooling window
  • stride: int or tuple, how far the window moves each step (defaults to kernel_size)
  • padding: int or tuple, pads the input edges with implicit negative infinity (not zeros); must be at most half of kernel_size
  • dilation: int or tuple, spacing between kernel elements (usually 1)
python
import torch.nn as nn

maxpool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1)
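To see how these parameters determine the output size, the sketch below applies the floor-mode formula from the PyTorch documentation, out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1, and compares it with what the layer returns. The helper name pooled_size is hypothetical (it is not part of PyTorch), and it assumes ceil_mode=False.

```python
import torch
import torch.nn as nn


def pooled_size(size, kernel_size, stride=None, padding=0, dilation=1):
    """Expected output length along one spatial dimension.

    Hypothetical helper for illustration; assumes ceil_mode=False.
    """
    stride = kernel_size if stride is None else stride  # stride defaults to kernel_size
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


x = torch.randn(1, 3, 32, 32)  # (batch_size, channels, height, width)
pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
out = pool(x)

expected = pooled_size(32, kernel_size=3, stride=2, padding=1)
print(expected)   # 16
print(out.shape)  # torch.Size([1, 3, 16, 16])
```

Only the height and width change; the batch and channel dimensions pass through untouched.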

Example

This example creates a max pooling layer with a 2x2 window and stride 2, then applies it to a fixed 4x4 input tensor (values 1 to 16) with one channel and batch size 1. The output is 2x2 because max pooling keeps only the largest value in each non-overlapping 2x2 block.

python
import torch
import torch.nn as nn

# Create a tensor with shape (batch_size=1, channels=1, height=4, width=4)
input_tensor = torch.tensor([[[[1.0, 2.0, 3.0, 4.0],
                               [5.0, 6.0, 7.0, 8.0],
                               [9.0, 10.0, 11.0, 12.0],
                               [13.0, 14.0, 15.0, 16.0]]]])

# Define max pooling layer
maxpool = nn.MaxPool2d(kernel_size=2, stride=2)

# Apply max pooling
output = maxpool(input_tensor)

print("Input Tensor:\n", input_tensor)
print("\nOutput Tensor after MaxPool2d:\n", output)
Output
Input Tensor:
 tensor([[[[ 1.,  2.,  3.,  4.],
          [ 5.,  6.,  7.,  8.],
          [ 9., 10., 11., 12.],
          [13., 14., 15., 16.]]]])

Output Tensor after MaxPool2d:
 tensor([[[[ 6.,  8.],
          [14., 16.]]]])
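The same layer works on larger batches with several channels, pooling each channel independently. The sketch below is an illustration with an assumed batch of 8 RGB-sized random tensors. It checks the result against the functional form F.max_pool2d and against a manual block-wise maximum computed by reshaping, which only works here because the windows do not overlap and the spatial size is divisible by the kernel size.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed example input: 8 images, 3 channels, 32x32 pixels
batch = torch.randn(8, 3, 32, 32)

pool = nn.MaxPool2d(kernel_size=2)  # stride defaults to kernel_size
pooled = pool(batch)
print(pooled.shape)  # torch.Size([8, 3, 16, 16])

# Functional equivalent of the module call
functional = F.max_pool2d(batch, kernel_size=2)

# Manual check: split H and W into 16 blocks of 2, then take the max per block
manual = batch.reshape(8, 3, 16, 2, 16, 2).amax(dim=(3, 5))

print(torch.equal(pooled, functional))  # True
print(torch.equal(pooled, manual))      # True
```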

Common Pitfalls

Common mistakes when using nn.MaxPool2d include:

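  • Forgetting that stride defaults to kernel_size, which gives non-overlapping windows. If you want overlapping windows, set stride explicitly to a value smaller than kernel_size.
  • Misunderstanding padding. MaxPool2d pads with implicit negative infinity, not zeros, so padding changes the output size but never introduces a new maximum. Padding must be at most half of kernel_size, otherwise PyTorch raises an error.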
  • Feeding input tensors with the wrong shape. The input must be 4D (batch_size, channels, height, width) or, in recent PyTorch versions, unbatched 3D (channels, height, width). A plain 2D image of shape (height, width) is rejected.

Example of a wrong input shape and the fix:

python
import torch
import torch.nn as nn

# Wrong: input is 2D (missing batch and channel dimensions)
input_wrong = torch.randn(4, 4)  # shape (height, width)
maxpool = nn.MaxPool2d(2)

try:
    output_wrong = maxpool(input_wrong)
except Exception as e:
    print(f"Error: {e}")

# Right: add batch and channel dimensions
input_right = input_wrong.unsqueeze(0).unsqueeze(0)  # shape (1, 1, height, width)
output_right = maxpool(input_right)
print("Output shape with correct input:", output_right.shape)
Output
Error: <message varies by PyTorch version; it states that a 3D or 4D input tensor is expected>
Output shape with correct input: torch.Size([1, 1, 2, 2])
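The stride and padding pitfalls can be checked in a few lines. The sketch below uses made-up inputs: a 5x5 tensor to compare the default stride with an explicit stride of 1, and an all-negative 4x4 tensor to show that padded positions never win the max.

```python
import torch
import torch.nn as nn

# 5x5 input with values 1..25
x = torch.arange(1.0, 26.0).reshape(1, 1, 5, 5)

non_overlapping = nn.MaxPool2d(kernel_size=3)         # stride defaults to 3
overlapping = nn.MaxPool2d(kernel_size=3, stride=1)   # windows overlap

print(non_overlapping(x).shape)  # torch.Size([1, 1, 1, 1])
print(overlapping(x).shape)      # torch.Size([1, 1, 3, 3])

# All-negative input: zero padding would produce 0.0 at the borders,
# but MaxPool2d pads with negative infinity, so every output stays -1.0.
negative = -torch.ones(1, 1, 4, 4)
padded_pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=1)
padded_out = padded_pool(negative)

print(padded_out.shape)                 # torch.Size([1, 1, 3, 3])
print(torch.all(padded_out == -1.0))    # tensor(True)
```

With the default stride, the 5x5 input yields a single window and the last two rows and columns are dropped, which is the kind of unexpected output size the first pitfall describes.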

Quick Reference

| Parameter | Description | Default |
|---|---|---|
| kernel_size | Size of the window to take the max over | Required |
| stride | Step size of the window | kernel_size |
| padding | Implicit negative-infinity padding added on both sides of each spatial dimension | 0 |
| dilation | Spacing between kernel elements | 1 |
| return_indices | If True, also returns the indices of the max values (for use with MaxUnpool2d) | False |
| ceil_mode | If True, uses ceil instead of floor to compute the output shape | False |
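The last two parameters are easier to understand with a small experiment. The sketch below is an illustration on assumed inputs: it uses return_indices=True together with nn.MaxUnpool2d to place the pooled maxima back at their original positions, then compares output shapes with and without ceil_mode on a 5x5 input.

```python
import torch
import torch.nn as nn

# return_indices: keep the flat position of each maximum within its 4x4 plane
x = torch.arange(1.0, 17.0).reshape(1, 1, 4, 4)
pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2)

pooled, indices = pool(x)
restored = unpool(pooled, indices)  # maxima restored, all other positions are 0

print(pooled.flatten().tolist())   # [6.0, 8.0, 14.0, 16.0]
print(indices.flatten().tolist())  # [5, 7, 13, 15]
print(restored.shape)              # torch.Size([1, 1, 4, 4])

# ceil_mode: a 5x5 input does not divide evenly into 2x2 windows
odd = torch.randn(1, 1, 5, 5)
floor_out = nn.MaxPool2d(kernel_size=2)(odd)
ceil_out = nn.MaxPool2d(kernel_size=2, ceil_mode=True)(odd)

print(floor_out.shape)  # torch.Size([1, 1, 2, 2])
print(ceil_out.shape)   # torch.Size([1, 1, 3, 3])
```

Unpooling is not a true inverse: only the maxima survive, and every other position in the restored tensor is zero.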

Key Takeaways

  • nn.MaxPool2d reduces spatial size by taking the maximum value in each sliding window over a 2D input.
  • Set kernel_size to define the pooling window and stride to control the step size; stride defaults to kernel_size.
  • Input to MaxPool2d must be 4D (batch_size, channels, height, width) or unbatched 3D (channels, height, width).
  • Padding uses implicit negative infinity, so it changes the output size but never the maximum values; it must be at most half of kernel_size.
  • Use return_indices=True if you plan to reverse pooling with MaxUnpool2d.