
Replacing classifier head in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What does 'replacing the classifier head' mean in a neural network?
It means changing the last layer(s) of a pre-trained model to fit a new task, like changing the output to match new classes.
beginner
Why do we replace the classifier head instead of retraining the whole model?
Because the earlier layers already learned useful features, so we only need to adjust the last part to the new task, saving time and data.
intermediate
In PyTorch, which module usually represents the classifier head in models like ResNet?
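In torchvision's ResNet models, the classifier head is the 'fc' (fully connected) layer, accessed as model.fc. Other model families may use a different attribute name, such as model.classifier in VGG.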
intermediate
How do you replace the classifier head in a PyTorch model?
You assign a new layer to the model's classifier attribute, for example: model.fc = nn.Linear(in_features, num_classes).
intermediate
What should you consider about the input features when replacing the classifier head?
The new classifier's input size must match the output size of the previous layer to connect properly.
What is the main reason to replace the classifier head in a pre-trained model?
A. To adapt the model to a new number of output classes
B. To change the input image size
C. To speed up the GPU
D. To reduce the number of layers
In PyTorch's ResNet, which attribute is replaced to change the classifier head?
A. model.fc
B. model.conv1
C. model.layer1
D. model.avgpool
If the original classifier outputs 1000 classes, and your new task has 10 classes, what should you do?
A. Keep the original classifier
B. Change the input image size
C. Replace the classifier head with output size 10
D. Add more layers before the classifier
What PyTorch module is commonly used to create a new classifier head?
A. nn.Conv2d
B. nn.MaxPool2d
C. nn.ReLU
D. nn.Linear
What must match when you attach a new classifier head to the rest of the model?
A. Output size of the new head and input size of the previous layer
B. Input size of the new head and output size of the previous layer
C. Number of layers in the model
D. Learning rate
Explain how and why you would replace the classifier head in a pre-trained PyTorch model.
Think about transfer learning and output classes.
Describe the steps to ensure the new classifier head connects correctly to the rest of the model.
Focus on layer sizes and connections.

      Practice

      1. What is the main reason to replace the classifier head in a pretrained PyTorch model?
      easy
      A. To adapt the model to a new task with different output classes
      B. To speed up the training by removing layers
      C. To reduce the model size by deleting layers
      D. To change the input image size the model accepts

      Solution

      1. Step 1: Understand the classifier head role

        The classifier head is the last layer that decides the output classes based on learned features.
      2. Step 2: Reason about adapting to new tasks

        Replacing the classifier head allows the model to output predictions for new classes different from the original training.
      3. Final Answer:

        To adapt the model to a new task with different output classes -> Option A
      4. Quick Check:

        Classifier head replacement = new task adaptation [OK]
      Hint: Classifier head controls output classes, replace for new tasks [OK]
      Common Mistakes:
      • Thinking replacing head changes input size
      • Assuming it reduces model size significantly
      • Believing it speeds up training by removing layers
      2. Which of the following is the correct way to replace the classifier head of a pretrained ResNet50 model in PyTorch for 10 output classes?
      easy
      A. model.fc = nn.Linear(2048, 10)
      B. model.classifier = nn.Linear(2048, 10)
      C. model.fc = nn.Linear(512, 10)
      D. model.head = nn.Linear(512, 10)

      Solution

      1. Step 1: Identify ResNet classifier attribute

        ResNet models use model.fc as the classifier head.
      2. Step 2: Check input feature size for ResNet

        ResNet50 and the deeper variants (ResNet101, ResNet152) produce 2048 features before the classifier, so the input size is 2048. ResNet18 and ResNet34 produce 512 instead.
      3. Final Answer:

        model.fc = nn.Linear(2048, 10) -> Option A
      4. Quick Check:

        ResNet50 classifier = model.fc with 2048 input features [OK]
      Hint: For ResNet50, the classifier is model.fc with 2048 input features [OK]
      Common Mistakes:
      • Using wrong attribute like model.classifier or model.head
      • Using 512 (the ResNet18/ResNet34 feature size) instead of 2048
      • Confusing ResNet with other models like VGG
      3. Given the code below, what will be the output shape of the model's final layer after replacement?
      import torch
      import torch.nn as nn
      from torchvision import models
      
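      # pretrained=True is deprecated in torchvision 0.13+; use the weights argument
      model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)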
      model.fc = nn.Linear(512, 5)
      
      input_tensor = torch.randn(1, 3, 224, 224)
      output = model(input_tensor)
      print(output.shape)
      medium
      A. torch.Size([1, 1000])
      B. torch.Size([1, 512])
      C. torch.Size([1, 5])
      D. torch.Size([3, 224, 224])

      Solution

      1. Step 1: Understand the replaced classifier output size

        The new classifier layer outputs 5 values per input (5 classes).
      2. Step 2: Check input batch size and output shape

        Input batch size is 1, so output shape is (1, 5).
      3. Final Answer:

        torch.Size([1, 5]) -> Option C
      4. Quick Check:

        Output shape = (batch_size, output_classes) = (1, 5) [OK]
      Hint: Output shape matches batch size and new class count [OK]
      Common Mistakes:
      • Expecting original 1000 classes output
      • Confusing feature size with output size
      • Misreading input tensor shape as output
      4. You tried replacing the classifier head of a pretrained ResNet with model.fc = nn.Linear(1024, 10) but got a runtime error during training. What is the most likely cause?
      medium
      A. The model.fc attribute does not exist in pretrained models
      B. The output size 10 is too large for the model
      C. You forgot to call model.eval() before training
      D. The input feature size 1024 does not match the model's actual output features

      Solution

      1. Step 1: Check input feature size for classifier

        The input size to the new Linear layer must match the output features of the previous layer.
      2. Step 2: Identify mismatch causing runtime error

        If 1024 does not match the features produced by the previous layer, the forward pass raises a shape mismatch RuntimeError.
      3. Final Answer:

        The input feature size 1024 does not match the model's actual output features -> Option D
      4. Quick Check:

        Input size mismatch causes runtime error [OK]
      Hint: Match Linear input size to previous layer output features [OK]
      Common Mistakes:
      • Assuming output size causes error
      • Confusing eval mode with training errors
      • Thinking model.fc is missing in pretrained models
      5. You want to fine-tune a pretrained ResNet50 on a dataset with 15 classes. Which code snippet correctly replaces the classifier head and freezes all layers except the new head?
      hard
      A.
          model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
          model.fc = nn.Linear(2048, 15)
          for param in model.parameters():
              param.requires_grad = False
      B.
          model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
          for param in model.parameters():
              param.requires_grad = False
          model.fc = nn.Linear(2048, 15)
      C.
          model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
          for param in model.fc.parameters():
              param.requires_grad = False
          model.fc = nn.Linear(2048, 15)
      D.
          model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
          model.fc = nn.Linear(512, 15)
          for param in model.parameters():
              param.requires_grad = True

      Solution

      1. Step 1: Freeze all existing model parameters

        Set param.requires_grad = False for all parameters to prevent updates during training.
      2. Step 2: Replace classifier head with correct input/output sizes

        ResNet50's classifier input size is 2048; output size is 15 for new classes.
      3. Step 3: Ensure new head parameters are trainable

        By replacing model.fc after freezing, new layer parameters default to requires_grad=True.
      4. Final Answer:

        Freeze all params, then replace head with nn.Linear(2048, 15) -> Option B
      5. Quick Check:

        Freeze old layers, replace head with correct sizes [OK]
      Hint: Freeze before replacing head to keep new layer trainable [OK]
      Common Mistakes:
      • Freezing after replacing head disables new layer training
      • Using wrong input size 512 instead of 2048
      • Not freezing any layers when the task calls for training only the new head