Recall & Review

beginner

What is the main purpose of a Convolutional Neural Network (CNN) in image classification?

A CNN automatically learns to detect important features like edges, shapes, and textures from images to classify them into categories.

beginner

What does a convolutional layer do in a CNN?

It applies small filters to the input image to create feature maps that highlight important patterns like edges or textures.
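
To make the idea of a filter concrete, here is a minimal pure-Python sketch that slides a 3x3 filter over a tiny grayscale image. It is only an illustration, not how PyTorch implements nn.Conv2d; the image values, the hand-written filter, and the helper name conv2d_valid are all assumptions made for this example.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding.

    Like deep-learning frameworks, this computes cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Hypothetical 5x5 image: dark on the left (0), bright on the right (1).
image = [[0, 0, 0, 1, 1] for _ in range(5)]

# Hypothetical hand-written vertical-edge filter; a CNN would learn its filters.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)
```

The feature map is 3x3 because a 5x5 input with a 3x3 filter and no padding leaves 5-3+1=3 positions per side. The nonzero entries appear where the window straddles the dark-to-bright boundary.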

beginner

Why do CNNs use pooling layers?

Pooling layers reduce the size of feature maps, making the model faster and helping it focus on the most important features.
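
As a rough illustration of that size reduction, the sketch below applies 2x2 max pooling with stride 2 to a small feature map in pure Python. The helper name max_pool2d and the input values are assumptions chosen for the example; in PyTorch the equivalent layer would be nn.MaxPool2d(2, 2).

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Keep the largest value in each `size` x `size` window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 3],
]

pooled = max_pool2d(feature_map)
print(pooled)  # 2x2 result: each side is halved
```

Each output value is the strongest response in its window, so a 4x4 map shrinks to 2x2 while the largest activations survive.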

beginner

What role does the fully connected layer play in a CNN for image classification?

It takes the extracted features and decides which class the image belongs to by combining all the information.
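
One way to picture this: the feature maps are flattened into a single vector, each class gets a weighted sum of that vector plus a bias, and the class with the highest score is the prediction. The sketch below is pure Python with made-up weights and a hypothetical helper named linear; in a real CNN these weights are learned during training.

```python
def linear(features, weights, bias):
    """Return one score per class: dot(weights[c], features) + bias[c]."""
    return [
        sum(w * x for w, x in zip(class_weights, features)) + b
        for class_weights, b in zip(weights, bias)
    ]


# Hypothetical 2x2 feature map produced by earlier layers.
feature_map = [[1.0, 0.0], [2.0, 1.0]]

# Flatten to a vector so every feature can contribute to every class score.
features = [value for row in feature_map for value in row]

# Made-up weights and biases for 3 classes (learned in a real model).
weights = [
    [0.5, 0.0, -1.0, 0.0],
    [0.0, 1.0, 1.0, 0.5],
    [-0.5, 0.0, 0.0, 1.0],
]
bias = [0.5, 0.0, -0.5]

scores = linear(features, weights, bias)
predicted_class = scores.index(max(scores))
print(scores, predicted_class)
```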

beginner

How is accuracy calculated during CNN training for image classification?

Accuracy is the percentage of images the CNN correctly classifies out of all images tested.
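
As a small worked example of that calculation, the sketch below compares predicted labels with true labels and reports the percentage that match. The function name accuracy and the label lists are assumptions for illustration only.

```python
def accuracy(predicted_labels, true_labels):
    """Percentage of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return 100.0 * correct / len(true_labels)


# Hypothetical predictions and ground truth for 8 images across 5 classes.
predicted = [0, 2, 1, 1, 4, 3, 0, 2]
actual = [0, 2, 1, 3, 4, 3, 1, 2]

acc = accuracy(predicted, actual)
print(f"{acc:.1f}%")  # 6 of the 8 predictions match
```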

What is the first layer usually used in a CNN for image classification?

A. Convolutional layer

B. Pooling layer

C. Fully connected layer

D. Dropout layer

Which layer reduces the spatial size of the feature maps?

A. Convolutional layer

B. Batch normalization layer

C. Fully connected layer

D. Pooling layer

What does the output layer of a CNN for classification usually use?

A. Softmax activation

B. Sigmoid activation

C. ReLU activation

D. Tanh activation

Which metric tells how many images were correctly classified?

A. Loss

B. Accuracy

C. Precision

D. Recall

What is the main advantage of using convolutional layers over fully connected layers for images?

A. They remove noise from images

B. They increase the image size

C. They reduce the number of parameters by sharing weights

D. They convert images to text

Explain the main components of a CNN architecture used for image classification and their roles.

Describe how accuracy is calculated during CNN training and why it is important.

Practice

1. What is the main role of convolutional layers in a CNN for image classification?

easy

A. To detect features like edges and textures in small parts of the image

B. To reduce the size of the image by downsampling

C. To combine all features into a final decision

D. To randomly change pixel values for data augmentation

5. You want to build a CNN in PyTorch to classify 64x64 RGB images into 5 classes. Which architecture below correctly combines convolution, pooling, and fully connected layers to achieve this? Assume the usual imports: import torch.nn as nn and import torch.nn.functional as F.

hard

A.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 13 * 13, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 13 * 13)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

B.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(10 * 32 * 32, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = x.view(-1, 10 * 32 * 32)
        x = self.fc1(x)
        return x

C.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 12 * 12, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 12 * 12)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

D.
class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.fc1 = nn.Linear(20 * 14 * 14, 50)
        self.fc2 = nn.Linear(50, 5)
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 20 * 14 * 14)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

Solution

Step 1: Calculate output sizes after conv and pooling layers
Input: 64x64. Conv1 kernel=5, padding=0: (64-5+1)=60, pool kernel=2 stride=2: 60/2=30. Conv2 kernel=5: (30-5+1)=26, pool: 26/2=13. Final size 20x13x13.

Step 2: Check fc1 input sizes
Option A (fc1 = nn.Linear(20 * 13 * 13, 50)): 20*13*13 matches the size computed in Step 1, so it is correct.
Option B (fc1 = nn.Linear(10 * 32 * 32, 5)): a single conv with kernel=3 gives 64-3+1=62, and pooling gives 62/2=31, so the true size is 10*31*31; 10*32*32 is wrong.
Option C (fc1 = nn.Linear(20 * 12 * 12, 50)): 20*12*12 is too small.
Option D (fc1 = nn.Linear(20 * 14 * 14, 50)): 20*14*14 is too large.

Final Answer:
nn.Linear(20 * 13 * 13, 50) -> Option A

Quick Check:
64 -> 60 -> 30 -> 26 -> 13, so the flattened size is 20x13x13 -> A

Hint: Calculate conv and pool sizes stepwise to find the fc input size.
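
The stepwise calculation in the hint can be sketched as a few small helpers. The function names conv_out, pool_out, and flattened_size are hypothetical; the formulas assume square inputs, dilation 1, and the floor rounding that nn.Conv2d and nn.MaxPool2d use by default.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Side length after a convolution (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Side length after max pooling (no padding, floor rounding)."""
    return (size - kernel) // stride + 1


def flattened_size(input_size, conv_specs):
    """Trace the size through conv + 2x2 max-pool stages.

    conv_specs is a list of (out_channels, kernel_size) pairs, where each
    conv is assumed to be followed by one 2x2 max pool with stride 2.
    Returns (channels, side, channels * side * side).
    """
    if not conv_specs:
        raise ValueError("need at least one conv stage")
    size = input_size
    channels = None
    for out_channels, kernel in conv_specs:
        size = pool_out(conv_out(size, kernel))
        channels = out_channels
    return channels, size, channels * size * size


# Two 5x5 convs (10 then 20 channels) on a 64x64 input.
two_stage = flattened_size(64, [(10, 5), (20, 5)])
print(two_stage)

# One 3x3 conv (10 channels) on a 64x64 input.
one_stage = flattened_size(64, [(10, 3)])
print(one_stage)
```

The first call traces 64 -> 60 -> 30 -> 26 -> 13, and the second traces 64 -> 62 -> 31, which is why a hard-coded 10 * 32 * 32 does not fit a single 3x3 convolution without padding.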

Common Mistakes:

Ignoring how kernel size reduces image dimensions
Assuming pooling does not halve size
Mismatching fc layer input size with conv output
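
One possible way to guard against the last mistake is to let PyTorch report the flattened size instead of hard-coding it. The sketch below assumes torch is installed and passes a dummy tensor through the convolution and pooling layers in __init__. The helper _features, the attribute flat_features, and the constructor parameters are hypothetical names introduced for this example, not part of the original options.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNN(nn.Module):
    def __init__(self, input_size=64, num_classes=5):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 10, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        # Infer the flattened size from a dummy input rather than by hand.
        with torch.no_grad():
            dummy = torch.zeros(1, 3, input_size, input_size)
            self.flat_features = self._features(dummy).shape[1]
        self.fc1 = nn.Linear(self.flat_features, 50)
        self.fc2 = nn.Linear(50, num_classes)

    def _features(self, x):
        # Conv -> ReLU -> pool, twice, then flatten everything but the batch dim.
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        return torch.flatten(x, 1)

    def forward(self, x):
        x = self._features(x)
        x = F.relu(self.fc1(x))
        return self.fc2(x)  # raw class scores (logits)


model = CNN()
logits = model(torch.randn(4, 3, 64, 64))  # batch of 4 random RGB images
print(model.flat_features)
print(logits.shape)
```

Flattening with torch.flatten(x, 1) keeps the batch dimension fixed, so a size mismatch surfaces as a shape error in fc1 rather than being silently absorbed by view(-1, ...).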

CNN architecture for image classification in PyTorch - Cheat Sheet & Quick Revision
