Recall & Review
beginner
What is fine-tuning in machine learning?
Fine-tuning is the process of taking a pre-trained model and training it a bit more on a new, often smaller, dataset to adapt it to a specific task.
beginner
Why do we freeze some layers during fine-tuning?
We freeze layers to keep their learned features unchanged, which helps prevent overfitting and reduces training time by only updating certain parts of the model.
intermediate
In PyTorch, how do you freeze layers of a model?
Set each parameter's requires_grad attribute to False, for example: for param in model.parameters(): param.requires_grad = False
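A minimal sketch of the freezing idiom, using a small stand-in network rather than a real pre-trained model (the layer sizes and the name `model` are illustrative assumptions):

```python
import torch
from torch import nn

# Hypothetical stand-in for a pre-trained network.
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)

# Freeze every parameter: autograd no longer computes gradients for them.
for param in model.parameters():
    param.requires_grad = False

# Equivalent shorthand on the module itself: model.requires_grad_(False)

# Collect the names of all frozen parameter tensors to confirm the result.
frozen = [name for name, p in model.named_parameters() if not p.requires_grad]
print(frozen)
```

Listing the frozen names this way is a quick sanity check before training starts.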
intermediate
What is a common strategy to fine-tune a pre-trained model?
First, freeze most layers and train only the last layers. Then, optionally unfreeze some earlier layers and train with a smaller learning rate.
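The two phases described in this answer could look roughly like the sketch below. The `backbone`/`head` split, the random data, and the learning rates are all assumptions chosen for illustration, not values from a real training recipe:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for a pre-trained model: a "backbone" plus a task head.
backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
head = nn.Linear(16, 3)
model = nn.Sequential(backbone, head)

# Random toy data, only there so the training loop has something to fit.
x = torch.randn(32, 8)
y = torch.randint(0, 3, (32,))
loss_fn = nn.CrossEntropyLoss()

def train(optimizer, steps=5):
    """Run a few optimisation steps and return the last loss value."""
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()

# Phase 1: freeze the backbone and train only the head.
for p in backbone.parameters():
    p.requires_grad = False
backbone_before = [p.detach().clone() for p in backbone.parameters()]
train(torch.optim.SGD(head.parameters(), lr=1e-2))
backbone_unchanged = all(
    torch.equal(a, b) for a, b in zip(backbone_before, backbone.parameters())
)

# Phase 2 (optional): unfreeze the backbone and continue with a smaller learning rate.
for p in backbone.parameters():
    p.requires_grad = True
train(torch.optim.SGD(model.parameters(), lr=1e-4))
backbone_changed = any(
    not torch.equal(a, b) for a, b in zip(backbone_before, backbone.parameters())
)
print(backbone_unchanged, backbone_changed)
```

Creating a fresh optimizer for phase 2 is one simple way to pick up the newly unfrozen parameters.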
beginner
How does using a smaller learning rate help during fine-tuning?
A smaller learning rate helps make small adjustments to the pre-trained weights, avoiding large changes that could ruin the learned features.
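To see the effect concretely, the sketch below takes one plain SGD step from the same starting weights with two learning rates and measures how far the weights move. The layer, the data, and the helper name `weight_change_after_one_step` are hypothetical:

```python
import torch
from torch import nn

def weight_change_after_one_step(lr):
    """Return how far one SGD step moves the weights for a given learning rate."""
    torch.manual_seed(0)  # same starting weights and data on every call
    layer = nn.Linear(4, 2)  # hypothetical stand-in for pre-trained weights
    before = layer.weight.detach().clone()
    x, y = torch.randn(16, 4), torch.randn(16, 2)

    optimizer = torch.optim.SGD(layer.parameters(), lr=lr)
    loss = nn.functional.mse_loss(layer(x), y)
    loss.backward()
    optimizer.step()
    return (layer.weight.detach() - before).norm().item()

large = weight_change_after_one_step(lr=1e-1)
small = weight_change_after_one_step(lr=1e-3)
print(f"lr=1e-1 moves weights by {large:.6f}, lr=1e-3 by {small:.6f}")
```

With plain SGD the update is the learning rate times the gradient, so the weight change scales in proportion to the learning rate.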
What does freezing layers in a model mean?
A. Stopping updates to those layers during training
B. Removing those layers from the model
C. Adding more neurons to those layers
D. Changing the activation function of those layers
Freezing layers means their weights do not get updated during training.
Why start fine-tuning by training only the last layers?
A. Because last layers have fewer parameters
B. Because last layers do not affect output
C. Because last layers are always frozen
D. Because last layers are usually task-specific
Last layers capture task-specific features, so training them first adapts the model quickly.
In PyTorch, which attribute controls if a parameter is trainable?
A. grad_enabled
B. trainable
C. requires_grad
D. update_flag
The requires_grad attribute controls whether autograd computes a gradient for a parameter; a parameter without a gradient is not updated by the optimizer.
What is a benefit of using a pre-trained model for fine-tuning?
A. It guarantees perfect accuracy
B. It reduces training time and data needed
C. It removes the need for validation
D. It makes the model smaller
Pre-trained models have learned useful features, so fine-tuning needs less data and time.
What happens if you use a large learning rate during fine-tuning?
A. The model might forget learned features
B. The model trains faster without issues
C. The model becomes smaller
D. The model ignores new data
A large learning rate can cause big changes that ruin the pre-trained knowledge.
Explain the step-by-step process of fine-tuning a pre-trained model in PyTorch.
Think about which layers to freeze and how to adjust learning rates.
Describe why fine-tuning is useful compared to training a model from scratch.
Consider the benefits of transfer learning.
Practice
1. What is the main purpose of fine-tuning a pre-trained PyTorch model?
easy
A. To adjust the model to perform well on a new task by training some layers
B. To train the model from scratch on a large dataset
C. To reduce the model size by removing layers
D. To convert the model to a different programming language
Solution
Step 1: Understand fine-tuning concept
Fine-tuning means taking a model already trained on one task and adjusting it to work well on a new task by training some of its layers.
Step 2: Compare options
Only option A (adjusting the model to perform well on a new task by training some layers) describes this process correctly. The other options describe unrelated actions.
Final Answer:
To adjust the model to perform well on a new task by training some layers -> Option A
Quick Check:
Fine-tuning = Adjust model layers for new task
Hint: Fine-tuning means training some layers for a new task
Common Mistakes:
Thinking fine-tuning means training from scratch
Confusing fine-tuning with model compression
Assuming fine-tuning changes the whole model
2. Which PyTorch code snippet correctly freezes all layers except the last one for fine-tuning?
easy
A. model.freeze_all_layers()
   model.unfreeze_last_layer()
B. for param in model.parameters(): param.requires_grad = True
   for param in model.fc.parameters(): param.requires_grad = False
C. model.requires_grad = False
   model.fc.requires_grad = True
D. for param in model.parameters(): param.requires_grad = False
   for param in model.fc.parameters(): param.requires_grad = True
Solution
Step 1: Understand freezing layers in PyTorch
Setting param.requires_grad = False freezes a parameter: no gradient is computed for it, so it is not updated during training.
Step 2: Analyze code snippets
Option D first freezes all parameters, then unfreezes only the last layer (model.fc). Option B does the reverse, option C only sets a plain attribute on the module and leaves its parameters untouched, and option A calls methods that PyTorch modules do not have.
Final Answer:
for param in model.parameters(): param.requires_grad = False
for param in model.fc.parameters(): param.requires_grad = True -> Option D
Quick Check:
Freeze all parameters, then unfreeze model.fc = Option D
Hint: Freeze all with requires_grad=False, then unfreeze last layer
Common Mistakes:
Setting requires_grad True for all layers by mistake
Using non-existent PyTorch methods
Forgetting to unfreeze the last layer
3. Given this PyTorch code for fine-tuning, what will be the output of print(sum(p.requires_grad for p in model.parameters()))?
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True
print(sum(p.requires_grad for p in model.parameters()))
medium
A. Number of all model parameters
B. Number of parameters in model.classifier
C. Zero
D. Raises an error
Solution
Step 1: Understand requires_grad flags
All parameters are first frozen (requires_grad=False). Then only parameters in model.classifier are unfrozen (requires_grad=True).
Step 2: Calculate sum of requires_grad
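Summing p.requires_grad counts how many parameter tensors (weight and bias tensors, not individual scalar weights) are trainable. Since only the model.classifier tensors are True, the sum equals the number of parameter tensors in model.classifier.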
Final Answer:
Number of parameters in model.classifier -> Option B
Quick Check:
Only classifier params require grad = Number of parameters in model.classifier
Hint: Sum requires_grad counts trainable parameters
Common Mistakes:
Assuming all parameters are trainable
Confusing boolean sum with total parameters
Expecting an error from this code
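The difference between counting trainable tensors and counting trainable scalar weights can be checked with a small sketch. `TinyNet` and its layer sizes are hypothetical; only the freezing code mirrors the question:

```python
from torch import nn

# Hypothetical model with a `classifier` attribute, mirroring the question's code.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 16)
        self.classifier = nn.Linear(16, 3)

    def forward(self, x):
        return self.classifier(self.features(x).relu())

model = TinyNet()
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

# Counts trainable parameter tensors: classifier.weight and classifier.bias.
trainable_tensors = sum(p.requires_grad for p in model.parameters())
# Counts trainable scalar weights: 16 * 3 weights + 3 biases.
trainable_scalars = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable_tensors, trainable_scalars)  # 2 51
```

Use the numel() version when you want the parameter count that model summaries usually report.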
4. You tried to fine-tune a model by freezing layers but the training loss does not change. What is the most likely error in your PyTorch code?
medium
A. You used the wrong optimizer
B. You forgot to set model.train() before training
C. You did not set requires_grad = True for any parameters
D. You replaced the last layer with wrong output size
Solution
Step 1: Analyze symptom - loss not changing
If loss stays the same, model parameters are not updating during training.
Step 2: Check requires_grad flags
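If all parameters have requires_grad = False, no gradients are computed and no weights are updated, so the loss cannot change. Note that when nothing in the graph requires gradients, loss.backward() raises a RuntimeError instead of silently doing nothing.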
Final Answer:
You did not set requires_grad = True for any parameters -> Option C
Quick Check:
No trainable params = no loss change
Hint: Check requires_grad True for trainable layers
Common Mistakes:
Assuming optimizer choice causes no loss change
Blaming a missing model.train() call for the unchanged loss
Ignoring requires_grad flags
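A quick diagnostic for this situation is to count the trainable parameter tensors before training. The sketch below uses a hypothetical stand-in model and a helper named `count_trainable` (not a PyTorch function) to show the symptom and one possible fix:

```python
import torch
from torch import nn

def count_trainable(model):
    """Number of parameter tensors that will receive gradients."""
    return sum(p.requires_grad for p in model.parameters())

# Hypothetical stand-in model with everything frozen by mistake.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
for p in model.parameters():
    p.requires_grad = False

print(count_trainable(model))  # 0

# With no trainable parameters the loss has no autograd graph, so backward() raises.
loss = model(torch.randn(5, 4)).sum()
try:
    loss.backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True

# Fix: unfreeze the last layer and give the optimizer only trainable parameters.
for p in model[2].parameters():
    p.requires_grad = True
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-2
)
print(count_trainable(model))  # 2
```

Filtering on requires_grad when building the optimizer keeps frozen parameters out of its parameter groups.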
5. You want to fine-tune a pre-trained ResNet model on a 10-class problem. Which strategy is best to start with?
hard
A. Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer
B. Train the entire ResNet model from scratch with 10 output classes
C. Freeze only the first convolutional layer and train the rest
D. Replace the final layer but keep all layers trainable without freezing
Solution
Step 1: Understand common fine-tuning approach
Starting by freezing all layers except the last layer is a common strategy to adapt a pre-trained model to a new task efficiently.
Step 2: Evaluate options
Option A matches this approach: freeze all layers, replace the final layer with a 10-output one, and train only that layer. Option B discards the pre-trained weights, while options C and D update most or all of the network from the start, which can be inefficient or unstable.
Final Answer:
Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer -> Option A
Quick Check:
Freeze all but last layer for new task
Hint: Freeze all, replace last layer, train only it first
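The strategy in option A can be sketched as follows. `TinyResNetLike` is a hypothetical stand-in that exposes its final layer as `fc`, the way torchvision's ResNet models do, so the example runs without downloading weights; the optimizer and learning rate are likewise assumptions:

```python
import torch
from torch import nn

class TinyResNetLike(nn.Module):
    """Hypothetical stand-in that mimics a ResNet's layout: conv features + `fc` head."""
    def __init__(self, num_classes=1000):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(torch.flatten(x, 1))

model = TinyResNetLike()  # pretend these weights were pre-trained

# 1. Freeze every existing layer.
for param in model.parameters():
    param.requires_grad = False

# 2. Replace the final layer; a freshly created layer is trainable by default.
model.fc = nn.Linear(model.fc.in_features, 10)

# 3. Optimise only the new layer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

logits = model(torch.randn(4, 3, 32, 32))
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(logits.shape, trainable)
```

Replacing `fc` after the freezing loop matters: the new layer is created with requires_grad=True, so no explicit unfreezing step is needed.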