2. Which of the following is the correct way to define a 2D convolutional layer in PyTorch with 3 input channels, 16 output channels, and a kernel size of 3?

easy

A. nn.Conv2d(16, 3, kernel_size=3)

B. nn.Conv1d(3, 16, kernel_size=3)

C. nn.Linear(3, 16, kernel_size=3)

D. nn.Conv2d(3, 16, kernel_size=3)

3. Given the following PyTorch CNN snippet, what is the output shape after the convolution and pooling layers if the input image size is (3, 32, 32)?

import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
    def forward(self, x):
        x = self.conv(x)
        x = self.pool(x)
        return x

model = SimpleCNN()
input_tensor = torch.randn(1, 3, 32, 32)
output = model(input_tensor)
print(output.shape)

medium

A. torch.Size([1, 8, 30, 30])

B. torch.Size([1, 8, 16, 16])

C. torch.Size([1, 3, 16, 16])

D. torch.Size([1, 8, 32, 32])

4. Identify the error in this PyTorch CNN model definition for image classification:

import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16 * 15 * 15, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 16 * 15 * 15)
        x = self.fc1(x)
        return x

medium

A. Pooling layer should come before convolution

B. The input size to fc1 is incorrect due to convolution output size mismatch

C. Missing import for torch.nn.functional as F

D. The number of output classes in fc1 should be 16

5. You want to build a CNN in PyTorch to classify 64x64 RGB images into 5 classes. Which architecture below correctly combines convolution, pooling, and fully connected layers to achieve this?

hard

A.

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 13 * 13, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 13 * 13)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

B.

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(10 * 32 * 32, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 10 * 32 * 32)
        x = self.fc1(x)
        return x

C.

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 12 * 12, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 12 * 12)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

D.

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 14 * 14, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 14 * 14)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

Solution

Step 1: Calculate output sizes after conv and pooling layers
Input: 64x64. Conv1 (kernel=5, padding=0, stride=1): 64-5+1 = 60. Pool (kernel=2, stride=2): 60/2 = 30. Conv2 (kernel=5): 30-5+1 = 26. Pool: 26/2 = 13. Conv2 has 20 output channels, so the final feature map is 20x13x13.
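The arithmetic above can be sketched in plain Python. The helper names `conv2d_out` and `maxpool2d_out` are hypothetical (they are not PyTorch functions), and the sketch assumes dilation 1 and the default floor rounding:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def maxpool2d_out(size, kernel=2, stride=2):
    """Output size of max pooling along one spatial dimension (no padding)."""
    return (size - kernel) // stride + 1


size = 64
trace = [size]
# Layer order from the question: conv(5) -> pool -> conv(5) -> pool
for layer in ("conv5", "pool", "conv5", "pool"):
    if layer == "conv5":
        size = conv2d_out(size, kernel=5)
    else:
        size = maxpool2d_out(size)
    trace.append(size)

flat_features = 20 * size * size  # 20 output channels from conv2

print(trace)          # [64, 60, 30, 26, 13]
print(flat_features)  # 3380
```

The same helpers cover the padded case: `conv2d_out(32, 3, padding=1)` returns 32, because a 3x3 kernel with padding 1 preserves the spatial size.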

Step 2: Check fc1 input sizes

Option A: fc1 expects 20 * 13 * 13, which matches the computed feature map. Correct.

Option B: a single conv with kernel=3 gives 64-3+1 = 62, and pooling gives 31, so the flattened size is 10 * 31 * 31, not 10 * 32 * 32. Wrong.

Option C: 20 * 12 * 12 is too small.

Option D: 20 * 14 * 14 is too big.

Final Answer:
nn.Linear(20 * 13 * 13, 50) -> Option A
Quick Check:
64 -> 60 -> 30 -> 26 -> 13, so the flattened size is 20x13x13 -> Option A
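One way to confirm the quick check is to run Option A on a dummy batch. This is a minimal sketch: the batch size of 4 and the random input are assumptions for illustration, and the `torch.nn.functional` import is added because the option snippets use `F.relu`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F  # required for F.relu


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 13 * 13, 50)
        self.fc2 = nn.Linear(50, 5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 13 * 13)
        x = F.relu(self.fc1(x))
        return self.fc2(x)


model = CNN()
batch = torch.randn(4, 3, 64, 64)  # hypothetical batch of four 64x64 RGB images

with torch.no_grad():
    # Feature map just before flattening
    features = model.pool(F.relu(model.conv2(model.pool(F.relu(model.conv1(batch))))))
    logits = model(batch)

print(features.shape)  # torch.Size([4, 20, 13, 13])
print(logits.shape)    # torch.Size([4, 5])
```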

Hint: Calculate the conv and pool output sizes step by step to find the fc input size.

Common Mistakes:

Ignoring how kernel size reduces image dimensions
Assuming pooling does not halve size
Mismatching fc layer input size with conv output
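A common way to avoid the last mistake is to measure the flattened size with a dummy forward pass instead of hard-coding it. The sketch below assumes the conv/pool stack from the question rewritten as an `nn.Sequential`; `infer_flat_features` is a hypothetical helper, not part of PyTorch.

```python
import torch
import torch.nn as nn


def infer_flat_features(feature_extractor, input_shape):
    """Pass one dummy sample through the conv/pool stack and count its outputs."""
    with torch.no_grad():
        dummy = torch.zeros(1, *input_shape)
        return feature_extractor(dummy).flatten(1).shape[1]


# Same conv/pool layers as the correct option, expressed as a Sequential
features = nn.Sequential(
    nn.Conv2d(3, 10, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
    nn.Conv2d(10, 20, 5), nn.ReLU(), nn.MaxPool2d(2, 2),
)

n_flat = infer_flat_features(features, (3, 64, 64))
classifier = nn.Sequential(
    nn.Flatten(), nn.Linear(n_flat, 50), nn.ReLU(), nn.Linear(50, 5)
)

print(n_flat)  # 3380, i.e. 20 * 13 * 13
```

If the input resolution or a kernel size changes, `n_flat` follows automatically, so the first linear layer stays consistent with the conv output.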

