When fine-tuning a model, the key metrics depend on the task. For classification, accuracy, precision, recall, and F1 score show whether the model generalizes to unseen data. For regression, mean squared error or R-squared are the usual choices. Fine-tuning aims to improve performance on a specific task without losing general knowledge, so tracking validation loss and validation metrics shows whether the model is still improving or has started to overfit.
Fine-tuning strategy in PyTorch - Model Metrics & Evaluation
Metrics & Evaluation - Fine-tuning strategy
Which metric matters for Fine-tuning strategy and WHY
Confusion matrix example for fine-tuned classification model
| | Predicted Positive | Predicted Negative |
|-----------------|---------------------|---------------------|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
Example:
TP = 85, FP = 15, TN = 90, FN = 10
Total samples = 85 + 15 + 90 + 10 = 200
From this, we calculate:
- Precision = TP / (TP + FP) = 85 / (85 + 15) = 0.85
- Recall = TP / (TP + FN) = 85 / (85 + 10) = 0.8947
- F1 Score = 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.85 * 0.8947) / (0.85 + 0.8947) ≈ 0.872
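The arithmetic above can be checked with a few lines of plain Python. This is only a sketch using the counts from the example. Accuracy is included for comparison, although the text does not compute it.

```python
# Counts taken from the worked example above.
tp, fp, tn, fn = 85, 15, 90, 10

total = tp + fp + tn + fn
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"total={total} accuracy={accuracy:.4f} precision={precision:.4f} "
      f"recall={recall:.4f} f1={f1:.4f}")
# total=200 accuracy=0.8750 precision=0.8500 recall=0.8947 f1=0.8718
```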
Precision vs Recall tradeoff in Fine-tuning
Fine-tuning can shift the balance between precision and recall. For example:
- If you fine-tune a spam detector, you want high precision to avoid marking good emails as spam.
- If you fine-tune a medical diagnosis model, you want high recall to catch as many true cases as possible, even if some false alarms occur.
Choosing which metric to prioritize depends on the real-world cost of mistakes.
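One common lever for this tradeoff, assumed here rather than taken from the text, is the decision threshold applied to the model's scores. The sketch below uses invented labels and scores, and the helper `precision_recall` is hypothetical. It shows precision rising and recall falling as the threshold increases.

```python
def precision_recall(labels, scores, threshold):
    """Return (precision, recall) when predicting positive for score >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical labels (1 = positive) and model scores, invented for illustration.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.10]

for threshold in (0.25, 0.50, 0.75):
    p, r = precision_recall(labels, scores, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
# threshold=0.25 precision=0.67 recall=1.00
# threshold=0.50 precision=0.75 recall=0.75
# threshold=0.75 precision=1.00 recall=0.50
```

A spam filter would lean toward the higher threshold, a medical screening model toward the lower one.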
Good vs Bad metric values for Fine-tuning
Good fine-tuning results show:
- Validation loss decreases steadily.
- Accuracy, precision, recall, and F1 improve compared to the base model.
- Balanced precision and recall if both matter.
Bad fine-tuning results show:
- Validation loss plateaus early or starts rising while training loss keeps falling (a sign of overfitting).
- Metrics on validation data do not improve or get worse.
- Very high precision but very low recall or vice versa without justification.
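A simple way to act on the validation-loss signal is an early-stopping check. The following is a minimal sketch, not a prescribed method. The helper names, the `patience` value, and the loss curves are all invented for illustration.

```python
def epochs_since_best(val_losses):
    """Number of epochs elapsed since the lowest validation loss."""
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch

def should_stop(val_losses, patience=2):
    """True when validation loss has not improved for `patience` epochs."""
    return epochs_since_best(val_losses) >= patience

# Hypothetical loss curves: training loss keeps falling, validation loss turns upward.
train_losses = [0.90, 0.60, 0.40, 0.25, 0.15]
val_losses = [0.95, 0.70, 0.62, 0.66, 0.73]

print(epochs_since_best(val_losses))  # 2
print(should_stop(val_losses))        # True
```

In a real training loop the check would run after each validation pass, typically alongside saving the checkpoint with the best validation loss.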
Common pitfalls in Fine-tuning metrics
- Accuracy paradox: High accuracy can be misleading if classes are imbalanced.
- Data leakage: Using test data during fine-tuning inflates metrics falsely.
- Overfitting: Metrics improve on training but worsen on validation.
- Ignoring metric tradeoffs: Focusing only on accuracy without checking precision and recall.
- Not monitoring validation metrics: Only training metrics can hide poor generalization.
Self-check question
Your fine-tuned model has 98% accuracy but only 12% recall on the fraud class. Is it good for production? Why or why not?
Answer: No, it is not good. The low recall means the model misses most fraud cases, which is critical in fraud detection. High accuracy is misleading because fraud is rare, so the model mostly predicts non-fraud correctly but fails to catch fraud. You should improve recall even if accuracy drops a bit.
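To make the answer concrete, here is one hypothetical set of counts (10,000 transactions, 2% fraud) that produces exactly 98% accuracy and 12% recall. The numbers are invented for illustration only. Under these assumptions, a baseline that never predicts fraud reaches the same accuracy.

```python
# Hypothetical counts chosen only so that accuracy is 98% and fraud recall is 12%.
tp, fn = 24, 176    # 200 fraud cases: 24 caught, 176 missed
tn, fp = 9776, 24   # 9,800 legitimate cases: 24 flagged by mistake

total = tp + fn + tn + fp
accuracy = (tp + tn) / total
recall = tp / (tp + fn)

# Baseline that always predicts "not fraud": every legitimate case is correct.
baseline_accuracy = (tn + fp) / total

print(f"accuracy={accuracy:.2%} recall={recall:.2%} baseline={baseline_accuracy:.2%}")
# accuracy=98.00% recall=12.00% baseline=98.00%
```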
Key Result
Fine-tuning success is best judged by balanced improvements in validation loss, precision, recall, and F1 score relevant to the task.
Practice
1. What is the main purpose of fine-tuning a pre-trained PyTorch model?
easy
Solution
Step 1: Understand fine-tuning concept
Fine-tuning means taking a model already trained on one task and adjusting it to work well on a new task by training some of its layers.
Step 2: Compare options
Only "To adjust the model to perform well on a new task by training some layers" describes this process correctly. The other options describe unrelated actions.
Final Answer:
To adjust the model to perform well on a new task by training some layers -> Option A
Quick Check:
Fine-tuning = Adjust model layers for new task [OK]
Hint: Fine-tuning means training some layers for a new task [OK]
Common Mistakes:
- Thinking fine-tuning means training from scratch
- Confusing fine-tuning with model compression
- Assuming fine-tuning always retrains the whole model
2. Which PyTorch code snippet correctly freezes all layers except the last one for fine-tuning?
easy
Solution
Step 1: Understand freezing layers in PyTorch
Setting param.requires_grad = False freezes that parameter so it won't update during training.
Step 2: Analyze code snippets
The correct snippet freezes all parameters first, then unfreezes only the last layer (model.fc):

    for param in model.parameters():
        param.requires_grad = False
    for param in model.fc.parameters():
        param.requires_grad = True

The other options reverse or misuse this logic or use non-existent methods.
Final Answer:
The snippet above -> Option D
Quick Check:
Freeze all, unfreeze last layer = the snippet above [OK]
Hint: Freeze all with requires_grad=False, then unfreeze last layer [OK]
Common Mistakes:
- Setting requires_grad True for all layers by mistake
- Using non-existent PyTorch methods
- Forgetting to unfreeze the last layer
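The freeze-then-unfreeze pattern can be tried on a small stand-in model. `TinyNet` below is hypothetical. It only mimics a pre-trained network whose last layer is called `fc`, which is an assumption about the model in the question.

```python
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for a pre-trained model whose last layer is `fc`."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.fc = nn.Linear(16, 2)

    def forward(self, x):
        return self.fc(self.backbone(x))

model = TinyNet()

# Freeze everything, then unfreeze only the final layer.
for param in model.parameters():
    param.requires_grad = False
for param in model.fc.parameters():
    param.requires_grad = True

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['fc.weight', 'fc.bias']
```

Listing the trainable parameter names before training is a cheap way to confirm that the intended layers, and only those, will be updated.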
3. Given this PyTorch code for fine-tuning, what will be the output of print(sum(p.requires_grad for p in model.parameters()))?

    for param in model.parameters():
        param.requires_grad = False
    for param in model.classifier.parameters():
        param.requires_grad = True
    print(sum(p.requires_grad for p in model.parameters()))

medium
Solution
Step 1: Understand requires_grad flags
All parameters are first frozen (requires_grad=False). Then only the parameters in model.classifier are unfrozen (requires_grad=True).
Step 2: Calculate sum of requires_grad
model.parameters() yields parameter tensors, so summing p.requires_grad counts how many tensors are trainable (each True counts as 1), not how many individual weights. Since only the model.classifier parameters are True, the sum equals the number of parameter tensors in model.classifier.
Final Answer:
Number of parameters in model.classifier -> Option B
Quick Check:
Only classifier params require grad = Number of parameters in model.classifier [OK]
Hint: Sum requires_grad counts trainable parameters [OK]
Common Mistakes:
- Assuming all parameters are trainable
- Confusing boolean sum with total parameters
- Expecting an error from this code
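The difference between counting parameter tensors and counting individual weights is easy to see on a small example. `TinyClassifierNet` is a hypothetical model with a `classifier` head, used here only as a sketch.

```python
import torch.nn as nn

class TinyClassifierNet(nn.Module):
    """Hypothetical model with a `features` body and a `classifier` head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Linear(8, 16)
        self.classifier = nn.Linear(16, 4)

    def forward(self, x):
        return self.classifier(self.features(x))

model = TinyClassifierNet()
for param in model.parameters():
    param.requires_grad = False
for param in model.classifier.parameters():
    param.requires_grad = True

# Counts trainable tensors: the classifier's weight and bias.
trainable_tensors = sum(p.requires_grad for p in model.parameters())
# Counts trainable scalar values inside those tensors.
trainable_elements = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(trainable_tensors)   # 2
print(trainable_elements)  # 68 = 16 * 4 + 4
```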
4. You tried to fine-tune a model by freezing layers but the training loss does not change. What is the most likely error in your PyTorch code?
medium
Solution
Step 1: Analyze symptom - loss not changing
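If loss stays the same, model parameters are not updating during training.
Step 2: Check requires_grad flags
If all parameters have requires_grad = False, gradients won't be computed and weights won't update, causing no loss change. Note that calling loss.backward() when no tensor in the graph requires gradients raises a RuntimeError, so a silently flat loss can also mean the unfrozen parameters were never passed to the optimizer.
Final Answer:
You did not set requires_grad = True for any parameters -> Option C
Quick Check: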
No trainable params = no loss change [OK]
Hint: Check requires_grad True for trainable layers [OK]
Common Mistakes:
- Assuming optimizer choice causes no loss change
- Forgetting to call model.train() but blaming loss
- Ignoring requires_grad flags
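A quick sanity check before training is to count the trainable parameters. The sketch below uses a single linear layer as a stand-in model. The helper `count_trainable` is hypothetical, and the learning rate is arbitrary. It shows the fully frozen failure case, then one working update after unfreezing.

```python
import torch
import torch.nn as nn

def count_trainable(module):
    """Number of scalar values that will receive gradients."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

model = nn.Linear(4, 2)
for param in model.parameters():
    param.requires_grad = False  # everything frozen by mistake

frozen_count = count_trainable(model)  # 0: nothing can be updated
loss = model(torch.ones(3, 4)).sum()
try:
    loss.backward()
    backward_failed = False
except RuntimeError:
    # No tensor in the graph requires grad, so autograd refuses to run.
    backward_failed = True

# Fix: unfreeze, and give the optimizer only the trainable parameters.
for param in model.parameters():
    param.requires_grad = True
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.1
)
before = model.weight.detach().clone()
loss = model(torch.ones(3, 4)).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
weights_changed = not torch.equal(before, model.weight)

print(frozen_count, backward_failed, weights_changed)  # 0 True True
```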
5. You want to fine-tune a pre-trained ResNet model on a 10-class problem. Which strategy is best to start with?
hard
Solution
Step 1: Understand common fine-tuning approach
Starting by freezing all layers except the last layer is a common strategy to adapt a pre-trained model to a new task efficiently.
Step 2: Evaluate options
"Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer" matches this approach. The other options either train from scratch or do not freeze enough layers, which can be inefficient or unstable.
Final Answer:
Freeze all layers, replace the final fully connected layer with 10 outputs, and train only this layer -> Option A
Quick Check:
Freeze all but last layer for new task [OK]
Hint: Freeze all, replace last layer, train only it first [OK]
Common Mistakes:
- Training entire model from scratch unnecessarily
- Freezing too few layers causing slow training
- Not replacing last layer to match output classes
