2. Which of the following is the correct way to define a 2D convolutional layer in PyTorch with 3 input channels, 16 output channels, and a kernel size of 3?
easy
A. nn.Conv2d(16, 3, kernel_size=3)
B. nn.Conv1d(3, 16, kernel_size=3)
C. nn.Linear(3, 16, kernel_size=3)
D. nn.Conv2d(3, 16, kernel_size=3)
Solution
Step 1: Identify correct layer type and parameters
For images, use nn.Conv2d with input channels first, then output channels, and kernel size.
Step 2: Check each option
Option D, nn.Conv2d(3, 16, kernel_size=3), uses the right layer type with the channels in the right order, so it is correct. Option B, nn.Conv1d(3, 16, kernel_size=3), uses Conv1d (wrong dimension). Option C, nn.Linear(3, 16, kernel_size=3), uses Linear, which is not a convolution and does not accept a kernel_size argument. Option A, nn.Conv2d(16, 3, kernel_size=3), reverses input and output channels.
Final Answer:
nn.Conv2d(3, 16, kernel_size=3) -> Option D
Quick Check:
Conv2d(input_channels, output_channels, kernel_size) = D [OK]
Hint: Conv2d uses (in_channels, out_channels, kernel_size) order [OK]
Common Mistakes:
Using Conv1d instead of Conv2d for images
Swapping input and output channels
Using Linear layer for convolution
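The argument order can be checked directly on the layer. The sketch below is illustrative only: the 32x32 input size and the variable names are assumptions chosen for the demonstration, not part of the question.

```python
import torch
import torch.nn as nn

# Option D: 3 input channels, 16 output channels, 3x3 kernel.
layer = nn.Conv2d(3, 16, kernel_size=3)

# The weight tensor is stored as (out_channels, in_channels, kH, kW).
weight_shape = tuple(layer.weight.shape)

# Assumed test input: one RGB image of size 32x32.
batch = torch.randn(1, 3, 32, 32)
out_shape = tuple(layer(batch).shape)

# Option A swaps the channel arguments, so the layer expects 16 input channels
# and rejects a 3-channel image at call time.
swapped = nn.Conv2d(16, 3, kernel_size=3)
try:
    swapped(batch)
    swapped_failed = False
except RuntimeError:
    swapped_failed = True

print(weight_shape)    # (16, 3, 3, 3)
print(out_shape)       # (1, 16, 30, 30)
print(swapped_failed)  # True
```

Note that the swapped layer is constructed without complaint; the mismatch only surfaces when data with the wrong channel count is passed through it.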
3. Given the following PyTorch CNN snippet, what is the output shape after the convolution and pooling layers if the input image size is (3, 32, 32)?
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)

    def forward(self, x):
        x = self.conv(x)
        x = self.pool(x)
        return x

model = SimpleCNN()
input_tensor = torch.randn(1, 3, 32, 32)
output = model(input_tensor)
print(output.shape)
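Conv2d with kernel=3 and padding=1 keeps width and height at 32 (32 + 2*1 - 3 + 1 = 32) and changes the channels from 3 to 8. MaxPool2d with kernel=2, stride=2 halves width and height: 32/2 = 16. Channels remain 8.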
Final Answer:
torch.Size([1, 8, 16, 16]) -> Option B
Quick Check:
Conv keeps size, pooling halves it = B [OK]
Hint: Conv with padding keeps size; pooling halves it [OK]
Common Mistakes:
Ignoring padding effect on convolution output size
Forgetting pooling halves spatial dimensions
Mixing up input and output channels
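The shape arithmetic above follows the standard output-size rule floor((n + 2*padding - kernel) / stride) + 1. Here is a minimal torch-free sketch of that rule applied to this question; the helper name conv_out is made up for illustration, and it assumes dilation 1 and the default floor rounding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a conv or max-pool layer.

    Assumes dilation 1 and floor rounding (PyTorch's defaults).
    """
    return (size + 2 * padding - kernel) // stride + 1

channels, height, width = 3, 32, 32

# nn.Conv2d(3, 8, kernel_size=3, padding=1): channels become 8,
# and padding=1 offsets the 3x3 kernel so the spatial size is unchanged.
channels = 8
height = conv_out(height, kernel=3, padding=1)
width = conv_out(width, kernel=3, padding=1)
after_conv = (channels, height, width)

# nn.MaxPool2d(2, 2): kernel 2, stride 2 halves each spatial axis.
height = conv_out(height, kernel=2, stride=2)
width = conv_out(width, kernel=2, stride=2)
after_pool = (channels, height, width)

print(after_conv)  # (8, 32, 32)
print(after_pool)  # (8, 16, 16)
```

With the batch dimension of 1 added in front, the pooled shape is the torch.Size([1, 8, 16, 16]) given in the final answer.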
4. Identify the error in this PyTorch CNN model definition for image classification:
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(16 * 15 * 15, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 16 * 15 * 15)
        x = self.fc1(x)
        return x
medium
A. Pooling layer should come before convolution
B. The input size to fc1 is incorrect due to convolution output size mismatch
C. Missing import for torch.nn.functional as F
D. The number of output classes in fc1 should be 16
Solution
Step 1: Check imports and usage
The forward method uses F.relu but torch.nn.functional as F is not imported, causing a NameError.
Step 2: Verify other parts
The input size of fc1 assumes a 32x32 input image: with kernel=3 and no padding the convolution gives 30x30, and pooling halves that to 15x15, so 16 * 15 * 15 is correct. Pooling after convolution is the usual order. 10 output classes is a reasonable choice.
Final Answer:
Missing import for torch.nn.functional as F -> Option C
Quick Check:
Using F.relu without import = C [OK]
Hint: Check all used modules are imported [OK]
Common Mistakes:
Forgetting to import torch.nn.functional as F
Miscalculating fc1 input size
Changing layer order incorrectly
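One reason this mistake is easy to miss is that Python resolves a name like F only when the line using it runs, so the class definition itself succeeds. The sketch below uses a torch-free stand-in class (not the quiz's Net) to show the timing; in the real model the fix is to add import torch.nn.functional as F at the top of the file.

```python
class Net:
    """Torch-free stand-in for the quiz model: forward() refers to F, which is never imported."""

    def forward(self, x):
        # F is looked up only when this line executes.
        return F.relu(x)

# Defining and instantiating the class raises nothing.
net = Net()

# The error appears on the first forward call.
try:
    net.forward(1.0)
    error_name = None
    message = ""
except NameError as exc:
    error_name = type(exc).__name__
    message = str(exc)

print(error_name)  # NameError
print(message)     # name 'F' is not defined
```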
5. You want to build a CNN in PyTorch to classify 64x64 RGB images into 5 classes. Which architecture below correctly combines convolution, pooling, and fully connected layers to achieve this?
hard
A.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 13 * 13, 50)
        self.fc2 = nn.Linear(50, 5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 13 * 13)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
B.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(10 * 32 * 32, 5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 10 * 32 * 32)
        x = self.fc1(x)
        return x
C.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 12 * 12, 50)
        self.fc2 = nn.Linear(50, 5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 12 * 12)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
D.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 14 * 14, 50)
        self.fc2 = nn.Linear(50, 5)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 14 * 14)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x
Solution
Step 1: Calculate output sizes after conv and pooling layers
Input: 64x64. Conv1 kernel=5, padding=0: (64-5+1)=60, pool kernel=2 stride=2: 60/2=30. Conv2 kernel=5: (30-5+1)=26, pool: 26/2=13. Final size 20x13x13.
Step 2: Check fc1 input sizes
Option A: fc1 expects 20*13*13, which matches the computed size. Correct.
Option B: a single conv with kernel=3 gives (64-3+1)=62, pooled to 31, so the flattened size is 10*31*31, but fc1 uses 10*32*32. Wrong.
Option C: 20*12*12 is too small.
Option D: 20*14*14 is too big.
Final Answer:
nn.Linear(20 * 13 * 13, 50) -> Option A
Quick Check:
64->60->30->26->13 = 20x13x13 -> A [OK]
Hint: Calculate conv and pool sizes stepwise to find fc input size [OK]
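The stepwise check can also be scripted. The following is a small torch-free sketch that traces the spatial size through each option's conv and pool layers and compares the result with the size hard-coded in fc1. The helper names (out_size, flattened_size) and the (kernel, stride) layer encoding are made up for illustration; padding 0 and dilation 1 are assumed, as in the options.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one axis for a conv or max-pool layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_size(input_size, layers, out_channels):
    """Trace a square input through (kernel, stride) layers and flatten the result."""
    size = input_size
    for kernel, stride in layers:
        size = out_size(size, kernel, stride)
    return out_channels * size * size

CONV5 = (5, 1)  # nn.Conv2d(..., 5): kernel 5, stride 1
CONV3 = (3, 1)  # nn.Conv2d(..., 3): kernel 3, stride 1
POOL = (2, 2)   # nn.MaxPool2d(2, 2)

two_conv = [CONV5, POOL, CONV5, POOL]  # options A, C, D
one_conv = [CONV3, POOL]               # option B

# option -> (layer stack, final channel count, size hard-coded in fc1)
options = {
    "A": (two_conv, 20, 20 * 13 * 13),
    "B": (one_conv, 10, 10 * 32 * 32),
    "C": (two_conv, 20, 20 * 12 * 12),
    "D": (two_conv, 20, 20 * 14 * 14),
}

matches = {}
for label, (layers, channels, declared) in options.items():
    actual = flattened_size(64, layers, channels)
    matches[label] = (actual == declared)
    print(label, actual, declared, matches[label])
```

Only option A's declared size agrees with the traced size (3380 = 20*13*13); option B traces to 9610 = 10*31*31 rather than the 10240 it declares.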