Kernel size, stride, padding in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is kernel size in a convolutional layer?
Kernel size is the size of the small window (filter) that slides over the input image or feature map to detect patterns. For example, a 3x3 kernel covers a region 3 pixels wide and 3 pixels tall at a time.
beginner
Explain stride in convolution.
Stride is how many pixels the kernel moves each time it slides over the input. A stride of 1 moves the kernel one pixel at a time, while a stride of 2 moves it two pixels at a time (skipping every other position), which roughly halves the output size.
beginner
What does padding do in convolutional layers?
Padding adds extra pixels (usually zeros) around the input edges. This helps keep the output size the same as the input or controls how much the output shrinks after convolution.
intermediate
How does increasing stride affect the output size?
Increasing stride makes the kernel jump further each step, so the output feature map becomes smaller because fewer positions are covered.
intermediate
Why might you use padding='same' in PyTorch convolution?
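padding='same' adds just enough padding for the output size to match the input size, which is useful when you want to keep spatial dimensions unchanged through layers. In PyTorch it is only supported with stride=1; strided convolutions need an explicit padding value.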
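The answers above can be checked by running a few layers on a dummy input. The following is a minimal sketch, assuming a 28x28 single-channel input, 4 output channels, and a PyTorch version that accepts padding="same" (1.9 or later); the dictionary keys are illustrative labels, not PyTorch names.

```python
import torch
import torch.nn as nn

# Dummy batch: 1 image, 1 channel, 28x28 pixels (sizes chosen for illustration).
x = torch.randn(1, 1, 28, 28)

# Each entry changes one of kernel_size, stride, or padding.
configs = {
    "k3_s1_p0": dict(kernel_size=3, stride=1, padding=0),
    "k5_s1_p0": dict(kernel_size=5, stride=1, padding=0),
    "k3_s2_p0": dict(kernel_size=3, stride=2, padding=0),
    "k3_s1_p1": dict(kernel_size=3, stride=1, padding=1),
    "k3_s1_same": dict(kernel_size=3, stride=1, padding="same"),
}

shapes = {}
for name, kwargs in configs.items():
    conv = nn.Conv2d(in_channels=1, out_channels=4, **kwargs)
    shapes[name] = tuple(conv(x).shape)
    print(name, shapes[name])

# k3_s1_p0   -> (1, 4, 26, 26)  a larger kernel or no padding shrinks the output
# k5_s1_p0   -> (1, 4, 24, 24)
# k3_s2_p0   -> (1, 4, 13, 13)  stride 2 roughly halves each spatial dimension
# k3_s1_p1   -> (1, 4, 28, 28)  padding 1 restores the size for a 3x3 kernel
# k3_s1_same -> (1, 4, 28, 28)
```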
What does a kernel size of (5,5) mean in a convolution?
A. The filter covers a region 5 pixels wide and 5 pixels tall at a time
B. The stride moves 5 pixels each step
C. Padding adds 5 pixels around the input
D. The output size will be 5 times smaller
If stride=2, how does the output size change compared to stride=1?
A. Output size doubles
B. Output size becomes zero
C. Output size halves approximately
D. Output size stays the same
What is the main purpose of padding in convolution?
A. To add extra pixels around input edges
B. To increase the number of channels
C. To reduce the kernel size
D. To speed up training
Which PyTorch parameter controls how far the kernel moves each step?
A. kernel_size
B. dilation
C. padding
D. stride
What happens if you use no padding with a 3x3 kernel and stride 1 on a 28x28 input?
A. Output size remains 28x28
B. Output size becomes 26x26
C. Output size becomes 30x30
D. Output size becomes 1x1
Describe how kernel size, stride, and padding affect the output size of a convolutional layer.
Think about how the filter moves and how edges are handled.
Explain why padding might be important when stacking many convolutional layers.
Consider what happens to image size after many convolutions.
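As a hint for the stacking question, the shrinkage can be traced with plain arithmetic. This is a sketch under stated assumptions: every layer uses stride 1 and dilation 1, and the helper name sizes_through_stack is hypothetical, not a PyTorch function.

```python
def sizes_through_stack(size, num_layers, kernel_size, padding):
    """Spatial size after each of num_layers identical stride-1 convolutions."""
    sizes = [size]
    for _ in range(num_layers):
        # With stride 1, one layer changes the size by 2*padding - (kernel_size - 1).
        size = size + 2 * padding - kernel_size + 1
        sizes.append(size)
    return sizes

no_pad = sizes_through_stack(28, num_layers=5, kernel_size=3, padding=0)
with_pad = sizes_through_stack(28, num_layers=5, kernel_size=3, padding=1)

print(no_pad)    # [28, 26, 24, 22, 20, 18]
print(with_pad)  # [28, 28, 28, 28, 28, 28]
```

Without padding, each 3x3 layer removes 2 pixels per dimension, so a deep stack eventually runs out of input; padding 1 keeps the size fixed.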

      Practice

      1. What does the stride parameter control in a convolutional layer in PyTorch?
      easy
      A. How far the filter moves on the input each step
      B. The size of the filter scanning the input
      C. The number of filters used in the layer
      D. The amount of zero padding added around the input

      Solution

      1. Step 1: Understand stride in convolution

        Stride defines the step size the filter moves when scanning the input image or feature map.
      2. Step 2: Differentiate stride from other parameters

        Kernel size is the filter size, padding adds pixels around input, and number of filters controls output depth.
      3. Final Answer:

        How far the filter moves on the input each step -> Option A
      4. Quick Check:

        Stride = step size of filter movement [OK]
      Hint: Stride = filter step size, not size or padding [OK]
      Common Mistakes:
      • Confusing stride with kernel size
      • Thinking stride controls padding
      • Mixing stride with number of filters
      2. Which of the following is the correct way to create a 2D convolutional layer in PyTorch with kernel size 3, stride 2, and padding 1?
      easy
      A. nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1)
      B. nn.Conv2d(1, 10, kernel=3, stride=2, pad=1)
      C. nn.Conv2d(1, 10, kernel_size=3, stride=1, padding=2)
      D. nn.Conv2d(in_channels=1, out_channels=10, kernel_size=2, stride=2, padding=1)

      Solution

      1. Step 1: Check PyTorch Conv2d parameter names

        Correct parameters are in_channels, out_channels, kernel_size, stride, and padding.
      2. Step 2: Match values to question

        Kernel size=3, stride=2, padding=1 matches only nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1) exactly.
      3. Final Answer:

        nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1) -> Option A
      4. Quick Check:

        Correct parameter names and values used [OK]
      Hint: Use exact PyTorch parameter names: kernel_size, stride, padding [OK]
      Common Mistakes:
      • Using wrong parameter names like kernel or pad
      • Mixing stride and padding values
      • Wrong kernel size or stride values
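The difference between options A and B can be seen by constructing both layers. A minimal sketch, assuming PyTorch is installed; the flag bad_names_rejected is only an illustrative variable.

```python
import torch.nn as nn

# Correct keyword names, as in option A.
conv = nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3, stride=2, padding=1)

# Integer arguments are stored as (height, width) tuples.
print(conv.kernel_size, conv.stride, conv.padding)  # (3, 3) (2, 2) (1, 1)

# Misspelled keywords, as in option B, are rejected when the layer is constructed.
try:
    nn.Conv2d(1, 10, kernel=3, stride=2, pad=1)
    bad_names_rejected = False
except TypeError as err:
    bad_names_rejected = True
    print("TypeError:", err)
```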
      3. Given an input tensor of size (1, 1, 7, 7), a Conv2d layer with out_channels=1, kernel_size=3, stride=2, and padding=1 is applied. What is the output shape?
      medium
      A. (1, 1, 3, 3)
      B. (1, 1, 5, 5)
      C. (1, 1, 2, 2)
      D. (1, 1, 4, 4)

      Solution

      1. Step 1: Use output size formula for Conv2d

        Output size = floor((Input + 2*padding - kernel_size)/stride) + 1, applied to each spatial dimension (assuming dilation=1)
      2. Step 2: Calculate output height and width

        Input=7, padding=1, kernel=3, stride=2
        Output = floor((7 + 2*1 - 3)/2) + 1 = floor((7 + 2 - 3)/2) + 1 = floor(6/2) + 1 = 3 + 1 = 4
      3. Final Answer:

        (1, 1, 4, 4) -> Option D
      4. Quick Check:

        Output size formula applied correctly [OK]
      Hint: Apply formula: floor((I+2P-K)/S)+1 for each dimension [OK]
      Common Mistakes:
      • Forgetting to add padding twice
      • Using ceil instead of floor
      • Mixing stride and kernel size in formula
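The formula from Step 1 fits in a small helper. This is a sketch assuming dilation=1; the function name conv2d_output_size is hypothetical and not part of PyTorch.

```python
import math

def conv2d_output_size(input_size, kernel_size, stride=1, padding=0):
    """floor((I + 2P - K) / S) + 1 for one spatial dimension (dilation = 1)."""
    return math.floor((input_size + 2 * padding - kernel_size) / stride) + 1

# Values from the question: I=7, K=3, S=2, P=1.
out = conv2d_output_size(7, kernel_size=3, stride=2, padding=1)
print(out)  # 4

# floor matters when the division is not exact: (8 + 2 - 3) / 2 = 3.5 -> 3, plus 1 = 4.
uneven = conv2d_output_size(8, kernel_size=3, stride=2, padding=1)
print(uneven)  # 4 (using ceil by mistake would give 5)
```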
      4. You create this PyTorch layer and get an error when applying it to an input of size (1, 1, 1, 1):
      nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=0)
      What is the likely cause?
      medium
      A. Kernel size must be even number
      B. Padding is too small for the input size, causing a non-positive output dimension
      C. Stride value must be 1 for kernel size 3
      D. Input channels and output channels are swapped

      Solution

      1. Step 1: Check output size with given parameters

        Output size = floor((1 + 2*0 - 3)/2) + 1 = floor((-2)/2) + 1 = floor(-1) + 1 = -1 + 1 = 0 (invalid)
      2. Step 2: Consider if padding causes error

        With padding=0 the padded input is still 1x1, which is smaller than the 3x3 kernel. The kernel cannot be placed even once, so PyTorch raises a RuntimeError in the forward pass. A padding of 1 would make the padded input 3x3 and give a valid 1x1 output.
      3. Final Answer:

        Padding is too small for the input size, causing a non-positive output dimension -> Option B
      4. Quick Check:

        Padding too small -> invalid output size [OK]
      Hint: Check if padding is too small for input size [OK]
      Common Mistakes:
      • Assuming stride must be 1 for kernel 3
      • Thinking kernel size must be even
      • Swapping input and output channels
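The failure in this question can be reproduced directly. A minimal sketch, assuming PyTorch is installed; the exact error text may differ between versions, so it is printed rather than quoted here.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 1, 1)  # a single 1x1 "image"

# Constructing the layer succeeds; only the forward pass can fail.
conv = nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=0)
try:
    conv(x)
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)

# With padding=1 the padded input is 3x3, so the 3x3 kernel fits exactly once.
padded_conv = nn.Conv2d(1, 10, kernel_size=3, stride=2, padding=1)
out_shape = tuple(padded_conv(x).shape)
print(out_shape)  # (1, 10, 1, 1)
```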
      5. You want to keep the output size the same as the input size (7x7) after a Conv2d layer with kernel_size=5 and stride=1. What padding value should you use?
      hard
      A. 1
      B. 0
      C. 2
      D. 3

      Solution

      1. Step 1: Use formula for output size with stride=1

        Output size = Input size if padding = (kernel_size - 1) / 2 (holds for stride=1 and an odd kernel_size)
      2. Step 2: Calculate padding

        Padding = (5 - 1) / 2 = 4 / 2 = 2
      3. Final Answer:

        2 -> Option C
      4. Quick Check:

        Padding = (kernel_size - 1)/2 keeps size same [OK]
      Hint: Padding = (kernel_size - 1) / 2 for same size with stride 1 [OK]
      Common Mistakes:
      • Using padding 0 or 1 incorrectly
      • Forgetting stride must be 1 for same size
      • Using padding larger than needed
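The padding rule can be confirmed on the 7x7 input from the question. A sketch assuming PyTorch 1.9 or later (for padding="same") and a single input and output channel chosen for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 1, 7, 7)

kernel_size = 5
padding = (kernel_size - 1) // 2  # 2; this rule is exact only for odd kernel sizes

# Explicit padding and padding="same" should agree when stride is 1.
explicit = nn.Conv2d(1, 1, kernel_size=kernel_size, stride=1, padding=padding)
same = nn.Conv2d(1, 1, kernel_size=kernel_size, stride=1, padding="same")

explicit_shape = tuple(explicit(x).shape)
same_shape = tuple(same(x).shape)
print(explicit_shape, same_shape)  # (1, 1, 7, 7) (1, 1, 7, 7)
```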