We replace the classifier head to change what the model predicts. This helps when using a pre-trained model for a new task with different output classes.
Replacing classifier head in PyTorch
model.classifier = torch.nn.Linear(in_features, out_features)
model.classifier is the part of the model that makes final predictions.
in_features must match the output size of the previous layer.
model.classifier = torch.nn.Linear(512, 10)
In ResNet models, the classifier head is named fc. This replaces it to output 5 classes.
model.fc = torch.nn.Linear(2048, 5)
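The requirement that in_features match the previous layer's output can be met without hard-coding a number, by reading it from the head being replaced. The sketch below assumes a small stand-in model (the hypothetical TinyNet, which is not a real pre-trained network) whose head is stored under the attribute name classifier.

```python
import torch
import torch.nn as nn


# Hypothetical stand-in for a pre-trained model: a feature extractor
# followed by a head stored under the attribute name `classifier`.
class TinyNet(nn.Module):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 8 * 8, 512),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyNet()

# Read the input size from the existing head so the new layer always
# matches the output size of the layer before it.
in_features = model.classifier.in_features
model.classifier = nn.Linear(in_features, 10)

batch = torch.randn(4, 3, 8, 8)  # batch of 4 small dummy images
output = model(batch)
print(output.shape)  # one row per input, one column per new class
```

Reading in_features from the old head keeps the code correct if the backbone is later swapped for one with a different feature size.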
This code loads a pre-trained ResNet18 model, replaces its classifier head to output 3 classes, and runs a dummy input through it. It prints the original classifier size, the output shape, and the output values.
import torch
import torch.nn as nn
import torchvision.models as models

# Load a pre-trained ResNet18 model
model = models.resnet18(pretrained=True)

# Check original classifier (fc) output features
print(f'Original classifier output features: {model.fc.out_features}')

# Replace the classifier head to output 3 classes
model.fc = nn.Linear(model.fc.in_features, 3)

# Create dummy input tensor (batch size 1, 3 color channels, 224x224 image)
dummy_input = torch.randn(1, 3, 224, 224)

# Get model output
output = model(dummy_input)

# Print output shape and values
print(f'Output shape: {output.shape}')
print(f'Output values: {output}')
Always match the input features of the new classifier to the previous layer's output size.
Replacing the classifier head is common in transfer learning.
After replacing, you usually need to train or fine-tune the new head.
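The note about training the new head can be illustrated with a few optimizer steps. This is a minimal sketch under stated assumptions: the backbone is a tiny hypothetical stand-in rather than a real pre-trained network, the data is random, and the backbone is frozen so that only the head is updated (freezing is one common choice, not a requirement).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained feature extractor.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 32), nn.ReLU())
# Freshly created head for a 3-class task.
head = nn.Linear(32, 3)
model = nn.Sequential(backbone, head)

# Freeze the backbone so gradients are computed only for the head.
for param in backbone.parameters():
    param.requires_grad = False

# Give the optimizer only the head's parameters.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Random dummy data standing in for a real dataset.
inputs = torch.randn(16, 3, 8, 8)
labels = torch.randint(0, 3, (16,))

# Snapshots used to check which weights actually change.
backbone_before = [p.detach().clone() for p in backbone.parameters()]
head_before = head.weight.detach().clone()

for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), labels)
    loss.backward()
    optimizer.step()

backbone_unchanged = all(
    torch.equal(before, after)
    for before, after in zip(backbone_before, backbone.parameters())
)
head_changed = not torch.equal(head_before, head.weight)
print(f"Backbone unchanged: {backbone_unchanged}, head changed: {head_changed}")
```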
Replacing the classifier head lets you adapt a model to new tasks.
Use the correct input and output sizes for the new layer.
This is a key step in transfer learning with PyTorch models.
Practice
Solution
Step 1: Understand the classifier head role
The classifier head is the last layer that decides the output classes based on learned features.
Step 2: Reason about adapting to new tasks
Replacing the classifier head allows the model to output predictions for new classes different from the original training.
Final Answer:
To adapt the model to a new task with different output classes -> Option A
Quick Check:
Classifier head replacement = new task adaptation [OK]
- Thinking replacing head changes input size
- Assuming it reduces model size significantly
- Believing it speeds up training by removing layers
Solution
Step 1: Identify ResNet classifier attribute
ResNet models use model.fc as the classifier head.
Step 2: Check input feature size for ResNet
ResNet50 and similar have 2048 features before the classifier, so input size is 2048.
Final Answer:
model.fc = nn.Linear(2048, 10) -> Option A
Quick Check:
ResNet classifier = model.fc with 2048 input features [OK]
- Using wrong attribute like model.classifier or model.head
- Using wrong input size like 512 instead of 2048
- Confusing ResNet with other models like VGG
import torch
import torch.nn as nn
from torchvision import models

model = models.resnet18(pretrained=True)
model.fc = nn.Linear(512, 5)
input_tensor = torch.randn(1, 3, 224, 224)
output = model(input_tensor)
print(output.shape)
Solution
Step 1: Understand the replaced classifier output size
The new classifier layer outputs 5 values per input (5 classes).
Step 2: Check input batch size and output shape
Input batch size is 1, so output shape is (1, 5).
Final Answer:
torch.Size([1, 5]) -> Option C
Quick Check:
Output shape = (batch_size, output_classes) = (1, 5) [OK]
- Expecting original 1000 classes output
- Confusing feature size with output size
- Misreading input tensor shape as output
You wrote model.fc = nn.Linear(1024, 10) but got a runtime error during training. What is the most likely cause?
Solution
Step 1: Check input feature size for classifier
The input size to the new Linear layer must match the output features of the previous layer.
Step 2: Identify mismatch causing runtime error
If 1024 is incorrect, the model will raise size mismatch errors during forward pass.
Final Answer:
The input feature size 1024 does not match the model's actual output features -> Option D
Quick Check:
Input size mismatch causes runtime error [OK]
- Assuming output size causes error
- Confusing eval mode with training errors
- Thinking model.fc is missing in pretrained models
Solution
Step 1: Freeze all existing model parameters
Set param.requires_grad = False for all parameters to prevent updates during training.
Step 2: Replace classifier head with correct input/output sizes
ResNet50's classifier input size is 2048; output size is 15 for new classes.
Step 3: Ensure new head parameters are trainable
By replacing model.fc after freezing, new layer parameters default to requires_grad=True.
Final Answer:
Freeze all params, then replace head with nn.Linear(2048, 15) -> Option B
Quick Check:
Freeze old layers, replace head with correct sizes [OK]
- Freezing after replacing head disables new layer training
- Using wrong input size 512 instead of 2048
- Not freezing any layers when fine-tuning
