How to Use MaxPool2d in PyTorch: Syntax and Example
Use torch.nn.MaxPool2d to apply 2D max pooling in PyTorch. Initialize it with parameters such as kernel_size, then call it on your input tensor to reduce the spatial dimensions by taking the maximum value in each window.
Syntax
The torch.nn.MaxPool2d class creates a max pooling layer for 2D inputs like images. Key parameters include:
- kernel_size: Size of the window to take max over.
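- stride: Step size to move the window (defaults to kernel_size if not set).
- padding: Implicit padding added to both sides of each spatial dimension. MaxPool2d pads with negative infinity rather than zeros, so padded cells never win the max.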
- dilation: Controls spacing between kernel elements.
- return_indices: If True, also returns the indices of the max values, which MaxUnpool2d can use later for unpooling.
After creating the layer, call it like a function on your input tensor.
python
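import torch
import torch.nn as nn

# Create the layer: 2x2 window, moved 2 pixels at a time, no padding
maxpool = nn.MaxPool2d(kernel_size=2, stride=2, padding=0)

# input_tensor is a 4D tensor: (batch, channels, height, width)
input_tensor = torch.randn(1, 3, 8, 8)  # placeholder input for illustration
output = maxpool(input_tensor)  # shape: (1, 3, 4, 4)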
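The stride default and the return_indices option from the parameter list can be tried out with a small sketch. The input values here are made up for illustration, and pairing the layer with nn.MaxUnpool2d is just one way to use the returned indices, not the only one.

```python
import torch
import torch.nn as nn

# Illustrative input: batch=1, channels=1, height=4, width=4, values 0..15
x = torch.arange(16, dtype=torch.float32).reshape(1, 1, 4, 4)

# stride is omitted, so it defaults to kernel_size (2)
pool = nn.MaxPool2d(kernel_size=2, return_indices=True)
pooled, indices = pool(x)
print(pooled.shape)  # torch.Size([1, 1, 2, 2])

# Each index is a position in the flattened height*width plane of its channel
print(indices)

# MaxUnpool2d writes each max back to its recorded position; other cells are zero
unpool = nn.MaxUnpool2d(kernel_size=2)
restored = unpool(pooled, indices)
print(restored.shape)  # torch.Size([1, 1, 4, 4])
```

Note that unpooling does not recover the original tensor: only the max of each window survives, and every other position is filled with zero.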
Example
This example shows how to create a max pooling layer with a 2x2 window and stride 2, then apply it to a sample 4D tensor representing a batch of one 4x4 grayscale image.
python
import torch
import torch.nn as nn

# Create a sample input tensor: batch=1, channels=1, height=4, width=4
input_tensor = torch.tensor([[[[1.0, 2.0, 3.0, 4.0],
                               [5.0, 6.0, 7.0, 8.0],
                               [9.0, 10.0, 11.0, 12.0],
                               [13.0, 14.0, 15.0, 16.0]]]])

# Define MaxPool2d layer
maxpool = nn.MaxPool2d(kernel_size=2, stride=2)

# Apply max pooling
output = maxpool(input_tensor)

print("Input Tensor:\n", input_tensor)
print("\nOutput Tensor after MaxPool2d:\n", output)
Output
Input Tensor:
tensor([[[[ 1., 2., 3., 4.],
[ 5., 6., 7., 8.],
[ 9., 10., 11., 12.],
[13., 14., 15., 16.]]]])
Output Tensor after MaxPool2d:
tensor([[[[ 6., 8.],
[14., 16.]]]])
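To see where these four values come from, the same result can be reproduced by hand. The sketch below is an illustrative cross-check, not how PyTorch implements pooling: it splits the 4x4 image into 2x2 blocks with a reshape and takes the max of each block. The reshape trick assumes the stride equals the kernel size and that height and width are divisible by it.

```python
import torch
import torch.nn as nn

# Same values as the example above: 1..16 in a (1, 1, 4, 4) tensor
x = torch.arange(1.0, 17.0).reshape(1, 1, 4, 4)

# Split height and width into (block, offset) pairs so each 2x2 window
# gets its own pair of axes: (N, C, H/2, 2, W/2, 2)
windows = x.reshape(1, 1, 2, 2, 2, 2)

# Max over the two "offset inside the window" axes
manual = windows.amax(dim=(3, 5))

layer_out = nn.MaxPool2d(kernel_size=2, stride=2)(x)
print(manual)
print(torch.equal(manual, layer_out))
```

Each output cell is the largest value of one non-overlapping 2x2 block, which is why the top-left block (1, 2, 5, 6) becomes 6.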
Common Pitfalls
Common mistakes when using MaxPool2d include:
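- Leaving stride different from kernel_size without meaning to: a smaller stride gives overlapping windows and a larger output than expected.
- Passing input with the wrong number of dimensions: the layer expects (batch, channels, height, width), and recent PyTorch versions also accept an unbatched (channels, height, width) tensor.
- Using padding incorrectly, which can change the output shape unexpectedly.
- Confusing max pooling with average pooling (AvgPool2d).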
Always check your input shape and output shape to ensure pooling behaves as expected.
python
import torch
import torch.nn as nn

input_tensor = torch.randn(1, 1, 5, 5)  # 5x5 input

# Unintended: stride smaller than kernel_size gives overlapping windows
maxpool_wrong = nn.MaxPool2d(kernel_size=3, stride=1)
output_wrong = maxpool_wrong(input_tensor)

# Intended: stride equal to kernel_size gives non-overlapping windows
# (only one 3x3 window fits, so the last two rows and columns are dropped)
maxpool_right = nn.MaxPool2d(kernel_size=3, stride=3)
output_right = maxpool_right(input_tensor)

print("Output shape with stride=1:", output_wrong.shape)
print("Output shape with stride=3:", output_right.shape)
Output
Output shape with stride=1: torch.Size([1, 1, 3, 3])
Output shape with stride=3: torch.Size([1, 1, 1, 1])
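The padding and AvgPool2d pitfalls can be illustrated together. The sketch below uses an all-negative input (an arbitrary choice made for illustration) so that the padding value becomes visible: MaxPool2d pads with negative infinity, while AvgPool2d with its default settings pads with zeros and counts them in the average.

```python
import torch
import torch.nn as nn

# All-negative input makes the effect of the padding value visible
x = -torch.ones(1, 1, 4, 4)

# kernel_size=3, stride=1, padding=1 keeps the 4x4 spatial size
max_out = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)(x)
avg_out = nn.AvgPool2d(kernel_size=3, stride=1, padding=1)(x)

print(max_out.shape)         # torch.Size([1, 1, 4, 4])
print(max_out.unique())      # only -1: padded cells never win the max
print(avg_out[0, 0, 0, 0])   # corner window averages 4 real cells and 5 zeros
```

If max pooling padded with zeros, the border outputs here would be 0 instead of -1, so the choice of padding value matters whenever the input can be negative.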
Quick Reference
Remember these tips when using MaxPool2d:
- kernel_size: size of pooling window (required).
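- stride: step size; defaults to kernel_size.
- padding: implicit negative-infinity padding around the input (default 0).
- input shape: typically a 4D tensor (batch, channels, height, width); recent PyTorch versions also accept unbatched 3D input (channels, height, width).
- Output size depends on input size, kernel_size, stride, padding, and dilation.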
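The dependence of output size on these parameters can be written as a small formula: floor((size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride) + 1 along each spatial axis. The sketch below wraps it in a helper named pooled_size, which is a hypothetical name introduced here and not part of PyTorch, and compares it with what the layer produces. It assumes ceil_mode is left at its default of False.

```python
import torch
import torch.nn as nn

def pooled_size(size, kernel_size, stride=None, padding=0, dilation=1):
    """Hypothetical helper: expected output length along one spatial axis.

    Assumes ceil_mode=False (the default), so the division rounds down.
    """
    if stride is None:
        stride = kernel_size  # same default as nn.MaxPool2d
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# Compare the formula with what the layer actually produces on a 7x7 input
configs = [
    dict(kernel_size=3, stride=1),
    dict(kernel_size=3, stride=3),
    dict(kernel_size=3, stride=2, padding=1),
    dict(kernel_size=2, stride=1, dilation=2),
]
size = 7
x = torch.randn(1, 1, size, size)
for cfg in configs:
    actual = nn.MaxPool2d(**cfg)(x).shape[-1]
    print(cfg, "formula:", pooled_size(size, **cfg), "actual:", actual)
```

Running a check like this before wiring the layer into a larger model is a cheap way to catch shape mismatches early.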
Key Takeaways
MaxPool2d reduces spatial size by taking max values in sliding windows over 2D inputs.
Set kernel_size and stride carefully to control output dimensions.
Input to MaxPool2d is normally a 4D tensor: (batch, channels, height, width).
Padding affects output size and should be used when needed to preserve borders.
Check output shapes to avoid unexpected results from stride and kernel size mismatch.