Practice


1. Why do CNNs use small filters that slide over an image?

easy

A. To detect local spatial patterns like edges and textures

B. To reduce the image size drastically in one step

C. To convert images into text data

D. To randomly change pixel colors

Solution

Step 1: Understand the role of filters in CNNs
Filters slide over small parts of the image to focus on local details like edges or shapes.
Step 2: Connect filter behavior to spatial pattern detection
By scanning the image locally, filters learn to recognize important spatial features that help in tasks like image recognition.
Final Answer:
To detect local spatial patterns like edges and textures -> Option A
Quick Check:
Filters detect local patterns = A [OK]

Hint: Filters scan small areas to find edges and shapes [OK]

Common Mistakes:

Thinking filters change image size drastically in one step
Believing CNNs convert images to text directly
Assuming filters randomly alter pixel colors
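
To make the sliding behaviour from the solution concrete, here is a minimal pure-Python sketch. The helper name slide_filter, the 5x5 image, and the vertical-edge filter values are all illustrative assumptions. In a real CNN the filter values are learned, and PyTorch performs this operation far more efficiently.

```python
# Slide a 3x3 filter over a tiny grayscale image (stride 1, no padding).
# slide_filter is a hypothetical helper written for this illustration.

def slide_filter(image, kernel):
    """Return the response map from sliding `kernel` over `image`."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    response = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum over the k x k patch whose top-left corner is (r, c).
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(k)
                for j in range(k)
            )
            row.append(total)
        response.append(row)
    return response

# 5x5 image: dark (0) in the left three columns, bright (1) in the right two.
image = [[0, 0, 0, 1, 1] for _ in range(5)]

# Hand-picked filter that responds when brightness increases left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = slide_filter(image, vertical_edge)
for row in response:
    print(row)  # each row prints [0, 3, 3]
```

The response is 0 where the patch is uniformly dark and 3 where the patch straddles the dark-to-bright boundary, which is the sense in which a small filter detects a local pattern.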

2. Which PyTorch code correctly creates a 2D convolutional layer with a 3x3 filter?

easy

A. torch.nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3)

B. torch.nn.Conv1d(in_channels=1, out_channels=10, kernel_size=3)

C. torch.nn.Linear(in_features=3, out_features=10)

D. torch.nn.Conv2d(in_channels=1, out_channels=10, kernel_size=5)

Solution

Step 1: Identify the correct convolution layer type
For images, 2D convolution (Conv2d) is used, not Conv1d or Linear layers.
Step 2: Check the kernel size matches 3x3
kernel_size=3 means a 3x3 filter, so Option A is correct. Option D also uses Conv2d, but kernel_size=5 gives a 5x5 filter.
Final Answer:
torch.nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3) -> Option A
Quick Check:
Conv2d with kernel_size=3 = A [OK]

Hint: Use Conv2d and kernel_size=3 for 3x3 filters [OK]

Common Mistakes:

Using Conv1d instead of Conv2d for images
Confusing Linear layers with convolution layers
Setting wrong kernel size for the filter
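
One way to confirm that an integer kernel_size produces a square filter is to inspect the layer after creating it. The snippet below is a small sketch along those lines; the variable name weight_shape is only for illustration.

```python
import torch

# kernel_size=3 is expanded to a square (3, 3) kernel.
conv = torch.nn.Conv2d(in_channels=1, out_channels=10, kernel_size=3)
print(conv.kernel_size)  # (3, 3)

# Weight layout: (out_channels, in_channels, kernel_height, kernel_width).
weight_shape = tuple(conv.weight.shape)
print(weight_shape)  # (10, 1, 3, 3)
```

The layer holds 10 separate 3x3 filters, one per output channel.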

3. Given this PyTorch code snippet, what is the output shape after the convolution?

import torch
conv = torch.nn.Conv2d(1, 1, kernel_size=3)
input = torch.randn(1, 1, 5, 5)
output = conv(input)
print(output.shape)

medium

A. torch.Size([1, 1, 5, 5])

B. torch.Size([1, 3, 3, 3])

C. torch.Size([1, 1, 7, 7])

D. torch.Size([1, 1, 3, 3])

Solution

Step 1: Understand convolution output size formula
Output size = Input size - Kernel size + 1 (assuming stride=1, padding=0). Here, 5 - 3 + 1 = 3.
Step 2: Apply formula to each spatial dimension
Both height and width become 3, so output shape is (1 batch, 1 channel, 3 height, 3 width).
Final Answer:
torch.Size([1, 1, 3, 3]) -> Option D
Quick Check:
Output size = 5-3+1 = 3 [OK]

Hint: Output size = input - kernel + 1 if no padding [OK]

Common Mistakes:

Assuming output size equals input size without padding
Confusing batch and channel dimensions
Misapplying kernel size in output calculation
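
The formula in Step 1 is the stride=1, padding=0 case of a more general rule. The helper below, conv_output_size, is a hypothetical function written for this illustration (it is not part of PyTorch), and it assumes dilation of 1.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1

# The case from the question: 5x5 input, 3x3 kernel, no padding.
no_padding = conv_output_size(5, 3)

# With one pixel of padding on each side, the spatial size is preserved.
with_padding = conv_output_size(5, 3, padding=1)

print(no_padding, with_padding)  # 3 5
```

With the defaults, the expression reduces to input - kernel + 1, matching the hint above.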

4. What is wrong with this PyTorch code for a convolutional layer?

import torch
conv = torch.nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3)
input = torch.randn(1, 1, 28, 28)
output = conv(input)
print(output.shape)

medium

A. Output channels must be less than input channels

B. Kernel size is too large for the input

C. Input channels do not match the layer's in_channels

D. Batch size must be greater than 1

Solution

Step 1: Check input and layer channel compatibility
The layer expects 3 input channels, but the input has only 1 (the second dimension of the tensor), so the forward pass fails with a channel mismatch error.
Step 2: Confirm other parameters are valid
A kernel size of 3 is valid for a 28x28 input, out_channels can be any positive integer, and a batch size of 1 is allowed.
Final Answer:
Input channels do not match the layer's in_channels -> Option C
Quick Check:
Input channels mismatch = C [OK]

Hint: Input channels must match Conv2d in_channels [OK]

Common Mistakes:

Ignoring channel mismatch errors
Thinking kernel size is invalid for input
Believing batch size must be >1
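
The mismatch can be reproduced and then resolved as sketched below. Changing in_channels to 1 is only one possible fix, chosen here as an assumption. If the data were really RGB, the input tensor would be the thing to change.

```python
import torch

conv = torch.nn.Conv2d(in_channels=3, out_channels=6, kernel_size=3)
x = torch.randn(1, 1, 28, 28)  # 1 channel, but the layer expects 3

try:
    conv(x)
    mismatch_raised = False
except RuntimeError as err:
    # PyTorch reports the expected and actual channel counts in the message.
    mismatch_raised = True
    print("RuntimeError:", err)

# One possible fix: build the layer with in_channels matching the input.
fixed_conv = torch.nn.Conv2d(in_channels=1, out_channels=6, kernel_size=3)
fixed_shape = tuple(fixed_conv(x).shape)
print(fixed_shape)  # (1, 6, 26, 26)
```

The output has 6 channels, and each spatial dimension is 28 - 3 + 1 = 26.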

5. How does using multiple convolutional layers help CNNs detect complex spatial patterns?

hard

A. Layers randomly shuffle pixels to create new patterns

B. Each layer learns higher-level features by combining simpler patterns from previous layers

C. Multiple layers reduce the image size to zero quickly

D. Each layer independently detects the same simple edges

Solution

Step 1: Understand feature hierarchy in CNNs
Early layers detect simple features like edges; later layers combine these to form complex shapes and objects.
Step 2: Explain how multiple layers build complexity
Stacking layers lets the network learn spatial patterns at increasing levels of abstraction, improving recognition.
Final Answer:
Each layer learns higher-level features by combining simpler patterns from previous layers -> Option B
Quick Check:
Layer stacking builds complex features = B [OK]

Hint: Layers build complexity by combining simpler features [OK]

Common Mistakes:

Thinking layers just reduce image size quickly
Believing layers shuffle pixels randomly
Assuming all layers detect the same simple edges
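
One way to see why later layers can represent larger patterns is to count how much of the input a single output value depends on. The sketch below assumes stride-1 convolutions with no dilation and no pooling; receptive_field is a hypothetical helper, not a PyTorch function.

```python
def receptive_field(kernel_sizes):
    """Input pixels (per side) seen by one output value after stacked
    stride-1, dilation-1 convolutions with the given kernel sizes."""
    field = 1
    for k in kernel_sizes:
        # Each layer widens the view by (k - 1) pixels.
        field += k - 1
    return field

# Receptive field after 1, 2, and 3 stacked 3x3 convolutions.
fields = [receptive_field([3] * depth) for depth in (1, 2, 3)]
print(fields)  # [3, 5, 7]
```

A value in the third layer depends on a 7x7 input region, so it can combine several of the 3x3 edge responses found by the first layer.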

Why CNNs detect spatial patterns in PyTorch - The Real Reasons
