
Kernel size, stride, padding in PyTorch - Model Pipeline Trace

Model Pipeline - Kernel size, stride, padding

This pipeline shows how an image passes through a convolutional layer in a neural network. It explains how kernel size, stride, and padding affect the image size and feature extraction.

Data Flow - 3 Stages
1. Input Image
   Input shape: 1 image x 1 channel x 5 height x 5 width
   Operation: Original grayscale image
   Output shape: 1 image x 1 channel x 5 height x 5 width
   [[1, 2, 3, 0, 1], [0, 1, 2, 3, 1], [1, 0, 1, 2, 2], [2, 1, 0, 1, 0], [1, 2, 1, 0, 1]]
2. Apply Padding
   Input shape: 1 x 1 x 5 x 5
   Operation: Add zero padding of 1 pixel around the image
   Output shape: 1 x 1 x 7 x 7
   [[0,0,0,0,0,0,0], [0,1,2,3,0,1,0], [0,0,1,2,3,1,0], [0,1,0,1,2,2,0], [0,2,1,0,1,0,0], [0,1,2,1,0,1,0], [0,0,0,0,0,0,0]]
3. Convolution with Kernel
   Input shape: 1 x 1 x 7 x 7
   Operation: Apply 3x3 kernel with stride 2
   Output shape: 1 x 1 x 3 x 3
   Sliding 3x3 kernel over padded image with steps of 2 pixels
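
The three stages above can be traced by hand in plain Python, without PyTorch, to make the padding and the stride-2 sliding explicit. This is only a sketch: the helper names zero_pad and conv2d are made up for illustration, and the all-ones kernel is an assumption, since the pipeline does not say which kernel values it uses.

```python
# The 5x5 grayscale image from stage 1.
image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [1, 2, 1, 0, 1],
]

def zero_pad(img, pad):
    """Surround a 2D list with `pad` rows and columns of zeros."""
    width = len(img[0]) + 2 * pad
    blank = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + row + [0] * pad for row in img]
    return blank + middle + [row[:] for row in blank]

def conv2d(img, kernel, stride):
    """Slide a square `kernel` over `img` in steps of `stride` (no extra padding)."""
    k = len(kernel)
    out_h = (len(img) - k) // stride + 1
    out_w = (len(img[0]) - k) // stride + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the k x k window by the kernel and sum the products.
            row.append(sum(
                img[top + a][left + b] * kernel[a][b]
                for a in range(k) for b in range(k)
            ))
        out.append(row)
    return out

# Hypothetical kernel: all ones, so each output is the sum of one 3x3 window.
kernel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

padded = zero_pad(image, 1)             # stage 2: 5x5 -> 7x7
features = conv2d(padded, kernel, 2)    # stage 3: 7x7 -> 3x3

print(len(padded), len(padded[0]))      # 7 7
print(len(features), len(features[0]))  # 3 3
```

With stride 2 the window's top-left corner visits rows and columns 0, 2, and 4 of the padded image, which is why only 3 positions fit along each side.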
Training Trace - Epoch by Epoch
[Plot: loss (y-axis, 0.0 to 1.0) against epochs 1 to 5 (x-axis)]
Epoch | Loss ↓ | Accuracy ↑ | Observation
1 | 0.85 | 0.45 | Initial training with high loss and low accuracy
2 | 0.65 | 0.60 | Loss decreased, accuracy improved
3 | 0.50 | 0.72 | Model learning features well
4 | 0.40 | 0.80 | Good convergence, loss decreasing steadily
5 | 0.35 | 0.85 | Training stabilizing with high accuracy
Prediction Trace - 3 Layers
Layer 1: Input Image
Layer 2: Padding
Layer 3: Convolution with 3x3 kernel, stride 2
Model Quiz - 3 Questions
Test your understanding
What does increasing the stride in a convolution do to the output size?
A. Keeps output size the same
B. Increases output size
C. Decreases output size
D. Removes padding
Key Insight
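Kernel size sets how large a patch of the input the model looks at in one step. Stride sets how far the kernel moves between steps, so a larger stride gives a smaller output. Padding adds a border around the input, which offsets the shrinkage the kernel would otherwise cause and lets the kernel cover edge pixels. Together, they shape how the model extracts features from images.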
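
To see the three parameters interact, the sketch below runs the same 1 x 1 x 5 x 5 input through a few nn.Conv2d layers and records the output shapes. The particular (kernel_size, stride, padding) combinations are chosen only for illustration, and the input is all zeros because only the shapes matter here.

```python
import torch
import torch.nn as nn

x = torch.zeros(1, 1, 5, 5)  # batch, channels, height, width

# (kernel_size, stride, padding) combinations picked for illustration.
settings = [(3, 1, 0), (3, 1, 1), (3, 2, 1), (5, 1, 0)]

shapes = {}
for k, s, p in settings:
    conv = nn.Conv2d(1, 1, kernel_size=k, stride=s, padding=p)
    shapes[(k, s, p)] = tuple(conv(x).shape)
    print(f"kernel={k} stride={s} padding={p} -> {shapes[(k, s, p)]}")
```

Comparing the rows: adding padding=1 to a 3x3 kernel restores the 5x5 size, raising the stride to 2 shrinks it to 3x3, and a 5x5 kernel with no padding collapses the image to a single value.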

Practice

1. What does the stride parameter control in a convolutional layer in PyTorch?
easy
A. How far the filter moves on the input each step
B. The size of the filter scanning the input
C. The number of filters used in the layer
D. The amount of zero padding added around the input

Solution

  1. Step 1: Understand stride in convolution

    Stride defines the step size the filter moves when scanning the input image or feature map.
  2. Step 2: Differentiate stride from other parameters

    Kernel size is the filter size, padding adds pixels around input, and number of filters controls output depth.
  3. Final Answer:

    How far the filter moves on the input each step -> Option A
  4. Quick Check:

    Stride = step size of filter movement [OK]
Hint: Stride = filter step size, not filter size or padding [OK]
Common Mistakes:
  • Confusing stride with kernel size
  • Thinking stride controls padding
  • Mixing stride with number of filters
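
A small plain-Python sketch of what "step size" means along one dimension. The helper window_starts is hypothetical (it is not a PyTorch function); it lists the left edge of every position a filter visits, assuming no padding.

```python
def window_starts(input_size, kernel_size, stride):
    """Return the left edge of every position a 1D filter visits."""
    last_start = input_size - kernel_size  # furthest start that still fits
    return list(range(0, last_start + 1, stride))

print(window_starts(7, 3, 1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, 2))  # [0, 2, 4]
print(window_starts(7, 3, 3))  # [0, 3]
```

The filter size stays at 3 in every call; only the gap between consecutive positions changes, and with it the number of outputs.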
2. Which of the following is the correct way to create a 2D convolutional layer in PyTorch with kernel size 3, stride 2, and padding 1?
easy
A. nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1)
B. nn.Conv2d(1, 10, kernel=3, stride=2, pad=1)
C. nn.Conv2d(1, 10, kernel_size=3, stride=1, padding=2)
D. nn.Conv2d(in_channels=1, out_channels=10, kernel_size=2, stride=2, padding=1)

Solution

  1. Step 1: Check PyTorch Conv2d parameter names

    Correct parameters are in_channels, out_channels, kernel_size, stride, and padding.
  2. Step 2: Match values to question

    Kernel size=3, stride=2, padding=1 matches only nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1) exactly.
  3. Final Answer:

    nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1) -> Option A
  4. Quick Check:

    Correct parameter names and values used [OK]
Hint: Use exact PyTorch parameter names: kernel_size, stride, padding [OK]
Common Mistakes:
  • Using wrong parameter names like kernel or pad
  • Mixing stride and padding values
  • Wrong kernel size or stride values
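
The difference between options A and B can be checked directly. The sketch below builds the layer from option A, then tries option B and catches the error raised for the unknown keyword names.

```python
import torch
import torch.nn as nn

# Option A: the keyword names nn.Conv2d accepts.
conv = nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1)
# Integer arguments are stored as (height, width) pairs.
print(conv.kernel_size, conv.stride, conv.padding)  # (3, 3) (2, 2) (1, 1)

# Option B: `kernel` and `pad` are not parameters of nn.Conv2d.
try:
    nn.Conv2d(1, 10, kernel=3, stride=2, pad=1)
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__
    print("rejected:", exc)
```

Options C and D construct without error, which is why they are easy to miss: their parameter names are valid and only the values differ from what the question asks for.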
3. Given an input tensor of size (1, 1, 7, 7), a Conv2d layer with kernel_size=3, stride=2, and padding=1 is applied. What is the output shape, assuming out_channels=1?
medium
A. (1, 1, 3, 3)
B. (1, 1, 5, 5)
C. (1, 1, 2, 2)
D. (1, 1, 4, 4)

Solution

  1. Step 1: Use output size formula for Conv2d

    Output size = floor((Input + 2*padding - kernel_size)/stride) + 1
  2. Step 2: Calculate output height and width

    Input=7, padding=1, kernel=3, stride=2
    Output = floor((7 + 2*1 - 3)/2) + 1 = floor((7 + 2 - 3)/2) + 1 = floor(6/2) + 1 = 3 + 1 = 4
  3. Final Answer:

    (1, 1, 4, 4) -> Option D
  4. Quick Check:

    Output size formula applied correctly [OK]
Hint: Apply formula: floor((I+2P-K)/S)+1 for each dimension [OK]
Common Mistakes:
  • Forgetting to add padding twice
  • Using ceil instead of floor
  • Mixing stride and kernel size in formula
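
The formula from Step 1 translates into a one-line helper. This is a sketch with a hypothetical function name, and it assumes the default dilation of 1, which is what the formula in this solution covers.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """floor((I + 2P - K) / S) + 1 for one spatial dimension."""
    # Python's // rounds down, which matches the floor in the formula.
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(7, kernel_size=3, stride=2, padding=1))  # 4
print(conv_output_size(5, kernel_size=3, stride=2, padding=1))  # 3
```

The second call reproduces the pipeline at the top of the page, where a 5x5 image with padding 1, a 3x3 kernel, and stride 2 produces a 3x3 output.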
4. You created this PyTorch layer and get an error when applying it to an input of size (1, 1, 1, 1):
nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=0)
What is the likely cause?
medium
A. Kernel size must be an even number
B. Padding is too small for the input size, causing an invalid (zero) output dimension
C. Stride value must be 1 for kernel size 3
D. Input channels and output channels are swapped

Solution

  1. Step 1: Check output size with given parameters

    Output size = floor((1 + 2*0 - 3)/2) + 1 = floor((-2)/2) + 1 = floor(-1) + 1 = -1 + 1 = 0 (invalid)
  2. Step 2: Consider if padding causes error

    With padding=0 the padded input is still 1x1, which is smaller than the 3x3 kernel. The calculated output size is zero, and PyTorch raises a RuntimeError because the kernel is larger than the padded input.
  3. Final Answer:

    Padding is too small for the input size, causing an invalid (zero) output dimension -> Option B
  4. Quick Check:

    Padding too small -> invalid output size [OK]
Hint: Check if padding is too small for input size [OK]
Common Mistakes:
  • Assuming stride must be 1 for kernel 3
  • Thinking kernel size must be even
  • Swapping input and output channels
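
The failure can be reproduced in a few lines. The sketch below applies the layer from the question to a 1x1 input and catches the error, then repeats the call with padding=1 (a value chosen here for illustration) so the 3x3 kernel fits the padded 3x3 input.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=0)
x = torch.zeros(1, 1, 1, 1)

# Building the layer succeeds; the error only appears in the forward pass.
try:
    conv(x)
    failed = False
except RuntimeError as exc:
    failed = True
    print("RuntimeError:", exc)

# With padding=1 the padded input is 3x3, so the 3x3 kernel fits once.
padded_conv = nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=1)
out_shape = tuple(padded_conv(x).shape)
print(out_shape)  # (1, 10, 1, 1)
```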
5. You want to keep the output size the same as the input size (7x7) after a Conv2d layer with kernel_size=5 and stride=1. What padding value should you use?
hard
A. 1
B. 0
C. 2
D. 3

Solution

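  1. Step 1: Use formula for output size with stride=1

    With stride=1 the formula becomes Output = Input + 2*padding - kernel_size + 1. Setting Output = Input gives padding = (kernel_size - 1) / 2, which is a whole number only when kernel_size is odd.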
  2. Step 2: Calculate padding

    Padding = (5 - 1) / 2 = 4 / 2 = 2
  3. Final Answer:

    2 -> Option C
  4. Quick Check:

    Padding = (kernel_size - 1)/2 keeps size same [OK]
Hint: Padding = (kernel_size - 1) / 2 for same size with stride 1 [OK]
Common Mistakes:
  • Using padding 0 or 1 incorrectly
  • Forgetting stride must be 1 for same size
  • Using padding larger than needed
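
A quick plain-Python check of the rule, using hypothetical helper names. It computes the padding for kernel_size=5 and confirms with the output size formula that a 7x7 input stays 7x7 at stride 1.

```python
def same_padding(kernel_size):
    """Padding that preserves size at stride 1; needs an odd kernel_size."""
    if kernel_size % 2 == 0:
        raise ValueError("symmetric same-size padding needs an odd kernel_size")
    return (kernel_size - 1) // 2

def conv_output_size(input_size, kernel_size, stride, padding):
    """floor((I + 2P - K) / S) + 1 for one spatial dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

p = same_padding(5)
print(p)                             # 2
print(conv_output_size(7, 5, 1, p))  # 7
```

Recent PyTorch versions also accept padding='same' in nn.Conv2d for stride-1 convolutions, which selects this padding without the manual calculation.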