Fine-tuning helps a pre-trained model learn new tasks faster by adjusting it slightly instead of starting from scratch.
Fine-tuning strategy in PyTorch
Introduction
Syntax
PyTorch
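import torch
import torchvision.models as models

num_classes = 10  # number of classes in the new task

# torchvision >= 0.13 prefers weights=models.ResNet18_Weights.DEFAULT
model = models.resnet18(pretrained=True)

# Freeze every pre-trained parameter
for param in model.parameters():
    param.requires_grad = False

# The new final layer is trainable by default
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)
# Training loop only updates model.fc parameters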
Set requires_grad = False to freeze layers you don't want to change.
Replace the last layer to match your new task's output classes.
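A quick way to confirm that the freeze took effect is to compare the number of trainable values with the total. The sketch below is only illustrative: it uses a small stand-in network (`TinyNet`, a hypothetical name) instead of a downloaded ResNet so it runs offline, and `count_parameters` is a helper written for this example, not a PyTorch API.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pre-trained backbone with an `fc` head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.fc = nn.Linear(16, 4)

    def forward(self, x):
        return self.fc(self.features(x))


def count_parameters(model):
    """Return (trainable, total) counts of scalar parameter values."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return trainable, total


model = TinyNet()
for param in model.parameters():
    param.requires_grad = False  # freeze everything

num_classes = 3  # assumed size of the new task
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head, trainable by default

trainable, total = count_parameters(model)
print(f"Trainable: {trainable} / {total}")
```

If the trainable count equals the total, nothing was frozen; if it is zero, the new head was frozen along with the rest.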
Examples
PyTorch
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)

# Freeze the whole backbone
for param in model.parameters():
    param.requires_grad = False

model.fc = torch.nn.Linear(model.fc.in_features, 10)  # 10 classes
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
PyTorch
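# Unfreeze only the last residual stage (layer4); everything else is frozen.
# Note: this also freezes model.fc, so add a check for 'fc' to keep the head trainable.
for name, param in model.named_parameters():
    if 'layer4' in name:
        param.requires_grad = True
    else:
        param.requires_grad = False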
PyTorch
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001)
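To check that the filtered optimizer really leaves frozen weights alone, you can snapshot the parameters, take one step, and compare. This is a minimal sketch under assumptions: the two-block `nn.Sequential` model, the random batch, and the learning rate are all made up for illustration and stand in for a real backbone and dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical two-block model standing in for a backbone plus head.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Freeze the first block, leave the last one trainable.
for param in model[0].parameters():
    param.requires_grad = False

optimizer = torch.optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()), lr=0.1
)

# Snapshot the weights before the update.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

inputs = torch.randn(6, 4)
targets = torch.tensor([0, 1, 0, 1, 1, 0])
loss = nn.CrossEntropyLoss()(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Record, per parameter tensor, whether the step changed it.
changed = {
    name: not torch.equal(before[name], p.detach())
    for name, p in model.named_parameters()
}
print(changed)
```

Entries for the frozen block (`0.weight`, `0.bias`) should report no change, while the trainable block should.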
Sample Model
This code runs a single fine-tuning step on only the last layer of a ResNet18 model, using a dummy batch of 5 random images with 3 classes.
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load pre-trained model
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer for 3 classes
num_classes = 3
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only parameters of final layer will be updated
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)

# Dummy input and target
inputs = torch.randn(5, 3, 224, 224)
targets = torch.tensor([0, 1, 2, 1, 0])

# Loss function
criterion = nn.CrossEntropyLoss()

# Training step
model.train()
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()

# Print loss and predictions
print(f"Loss: {loss.item():.4f}")
_, preds = torch.max(outputs, 1)
print(f"Predictions: {preds.tolist()}")
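One subtlety in the training step above: model.train() also puts BatchNorm layers into training mode, and their running mean and variance are buffers rather than parameters, so they keep updating even when requires_grad is False. If you want the frozen backbone to stay exactly as pre-trained, one possible approach is to switch those layers back to eval mode. The sketch below assumes a small hypothetical convolutional model in place of ResNet18, and `freeze_batchnorm_stats` is a helper name invented for this example.

```python
import torch
import torch.nn as nn


def freeze_batchnorm_stats(model):
    """Put every BatchNorm layer in eval mode so running stats stop updating."""
    for module in model.modules():
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            module.eval()


# Hypothetical small backbone with one BatchNorm layer, plus a linear head.
model = nn.Sequential(
    nn.Conv2d(3, 4, kernel_size=3, padding=1),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(4, 3),
)
for param in model[:5].parameters():
    param.requires_grad = False  # freeze everything except the head

model.train()                  # would normally let running stats update
freeze_batchnorm_stats(model)  # repeat after every call to model.train()

stats_before = model[1].running_mean.clone()
with torch.no_grad():
    model(torch.randn(5, 3, 16, 16))
stats_unchanged = torch.equal(stats_before, model[1].running_mean)
print(f"Running mean unchanged: {stats_unchanged}")
```

Whether to freeze the statistics is a judgment call: letting them adapt can help when the new data looks very different from the pre-training data.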
Important Notes
Freezing layers helps keep learned features and reduces training time.
Fine-tuning too many layers with little data can cause overfitting.
Adjust learning rate carefully; often smaller rates work better for fine-tuning.
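One common way to act on the learning-rate note is to give pre-trained layers a smaller rate than the new head by using optimizer parameter groups. The sketch below is illustrative only: the `backbone` and `head` modules and the two learning rates are assumptions chosen for the example, not recommended values.

```python
import torch
import torch.nn as nn

# Hypothetical model: `backbone` plays the role of pre-trained layers,
# `head` the freshly initialized classifier.
model = nn.ModuleDict({
    "backbone": nn.Linear(16, 16),
    "head": nn.Linear(16, 10),
})

# Each dict is one parameter group with its own learning rate.
optimizer = torch.optim.SGD(
    [
        {"params": model["backbone"].parameters(), "lr": 1e-4},  # gentle updates
        {"params": model["head"].parameters(), "lr": 1e-2},      # faster updates
    ],
    momentum=0.9,
)

group_lrs = [group["lr"] for group in optimizer.param_groups]
print(group_lrs)
```

Options not set inside a group, such as momentum here, fall back to the value passed to the optimizer itself.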
Summary
Fine-tuning adjusts a pre-trained model to a new task by training some layers.
Freeze layers you want to keep unchanged by setting requires_grad = False.
Replace the last layer to match your new task's output classes and train only it or a few layers.
Practice
1. What is the main purpose of fine-tuning a pre-trained PyTorch model?
easy
Solution
Step 1: Understand fine-tuning concept
Fine-tuning means taking a model already trained on one task and adjusting it to work well on a new task by training some of its layers.
Step 2: Compare options
Only "To adjust the model to perform well on a new task by training some layers" describes this process correctly. Other options describe unrelated actions.
Final Answer:
To adjust the model to perform well on a new task by training some layers -> Option A
Quick Check:
Fine-tuning = Adjust model layers for new task [OK]
Hint: Fine-tuning means training some layers for a new task [OK]
Common Mistakes:
- Thinking fine-tuning means training from scratch
- Confusing fine-tuning with model compression
- Assuming fine-tuning must always update the whole model
2. Which PyTorch code snippet correctly freezes all layers except the last one for fine-tuning?
easy
Solution
Step 1: Understand freezing layers in PyTorch
Setting param.requires_grad = False on a layer's parameters freezes that layer so it won't update during training.
Step 2: Analyze code snippets
The following snippet freezes all parameters first, then unfreezes only the last layer (model.fc):
for param in model.parameters():
    param.requires_grad = False
for param in model.fc.parameters():
    param.requires_grad = True
The other options reverse or misuse this logic or use non-existent methods.
Final Answer:
The snippet above -> Option D
Quick Check:
Freeze all, unfreeze last layer = the snippet above [OK]
Hint: Freeze all with requires_grad=False, then unfreeze last layer [OK]
Common Mistakes:
- Setting requires_grad True for all layers by mistake
- Using non-existent PyTorch methods
- Forgetting to unfreeze the last layer
3. Given this PyTorch code for fine-tuning, what will be the output of print(sum(p.requires_grad for p in model.parameters()))?
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True
print(sum(p.requires_grad for p in model.parameters()))
medium
Solution
Step 1: Understand requires_grad flags
All parameters are first frozen (requires_grad=False). Then only parameters in model.classifier are unfrozen (requires_grad=True).
Step 2: Calculate sum of requires_grad
Summing p.requires_grad treats each True as 1, so it counts the trainable parameter tensors (for example, one weight and one bias per linear layer), not the individual scalar values. Since only model.classifier parameters are True, the sum equals the number of parameter tensors in model.classifier.
Final Answer:
Number of parameters in model.classifier -> Option B
Quick Check:
Only classifier params require grad = Number of parameters in model.classifier [OK]
Hint: Sum requires_grad counts trainable parameters [OK]
Common Mistakes:
- Assuming all parameters are trainable
- Confusing boolean sum with total parameters
- Expecting an error from this code
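The difference between counting parameter tensors and counting individual values is easy to see on a small model. This sketch assumes a hypothetical `SmallModel` with a `classifier` attribute, mirroring the question; the layer sizes are arbitrary.

```python
import torch.nn as nn


class SmallModel(nn.Module):
    """Hypothetical model with a `classifier` attribute, as in the question."""

    def __init__(self):
        super().__init__()
        self.features = nn.Linear(10, 20)
        self.classifier = nn.Linear(20, 5)

    def forward(self, x):
        return self.classifier(self.features(x))


model = SmallModel()
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

# Each True counts as 1, so this is the number of trainable tensors.
trainable_tensors = sum(p.requires_grad for p in model.parameters())
# numel() counts the scalar values inside each trainable tensor.
trainable_values = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable_tensors, trainable_values)  # 2 105
```

The classifier has one weight tensor (20 x 5 = 100 values) and one bias tensor (5 values), giving 2 tensors and 105 values.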
4. You tried to fine-tune a model by freezing layers but the training loss does not change. What is the most likely error in your PyTorch code?
medium
Solution
Step 1: Analyze symptom - loss not changing
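If loss stays the same, model parameters are not updating during training.
Step 2: Check requires_grad flags
If all parameters have requires_grad = False, no gradients are computed and no weights update, so the loss cannot improve. In that fully frozen case loss.backward() normally raises a RuntimeError, because the loss has no graph to differentiate through, so also check that the optimizer was given the parameters you unfroze.
Final Answer:
You did not set requires_grad = True for any parameters -> Option C
Quick Check: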
No trainable params = no loss change [OK]
Hint: Check requires_grad True for trainable layers [OK]
Common Mistakes:
- Assuming optimizer choice causes no loss change
- Forgetting to call model.train() and then blaming the loss function
- Ignoring requires_grad flags
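A short diagnostic can reveal this mistake before training starts. The sketch below assumes a tiny `nn.Linear` as a stand-in model with everything frozen by accident: listing the trainable tensors returns an empty list, and calling backward() on the loss raises a RuntimeError because nothing in the graph requires gradients.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # hypothetical stand-in model
for param in model.parameters():
    param.requires_grad = False  # everything frozen by mistake

# An empty list here means no layer can learn.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(f"Trainable tensors: {trainable}")

loss = nn.CrossEntropyLoss()(model(torch.randn(3, 4)), torch.tensor([0, 1, 0]))
try:
    loss.backward()
    backward_failed = False
except RuntimeError as err:
    backward_failed = True
    print(f"backward() failed: {err}")
```

If the list is not empty but the loss still does not move, it is worth checking that those same parameters were passed to the optimizer.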
5. You want to fine-tune a pre-trained ResNet model on a 10-class problem. Which strategy is best to start with?
hard
Solution
Step 1: Understand common fine-tuning approach
Starting by freezing all layers except the last layer is a common strategy to adapt a pre-trained model to a new task efficiently.
Step 2: Evaluate options
"Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer" matches this approach: freeze all, replace last layer for 10 classes, train only last layer. Other options either train from scratch or do not freeze enough layers, which can be inefficient or unstable.
Final Answer:
Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer -> Option A
Quick Check:
Freeze all but last layer for new task [OK]
Hint: Freeze all, replace last layer, train only it first [OK]
Common Mistakes:
- Training entire model from scratch unnecessarily
- Freezing too few layers causing slow training
- Not replacing last layer to match output classes
