How to Fine Tune a Model in PyTorch: Step-by-Step Guide
To fine tune a model in PyTorch, load a pretrained model from torchvision.models, freeze its early layers by setting requires_grad = False, replace the final layer to match your task, then train only the unfrozen layers on your dataset with an optimizer and loss function.
Syntax
Fine tuning in PyTorch involves these key steps:
- Load pretrained model: Use torchvision.models or other sources.
- Freeze layers: Set param.requires_grad = False for layers you don't want to update.
- Modify final layer: Replace the last layer to fit your output classes.
- Define optimizer: Pass only parameters with requires_grad=True.
- Train: Run the training loop, updating only the unfrozen layers.
```python
import torch
import torchvision.models as models

# Load pretrained model
# (on torchvision >= 0.13, prefer weights=models.ResNet18_Weights.DEFAULT,
# as the pretrained argument is deprecated)
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final layer for 10 classes
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Only parameters of the final layer will be updated
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001)

# Loss function
criterion = torch.nn.CrossEntropyLoss()
```
Example
This example fine tunes a pretrained ResNet18 on a dummy dataset with 10 classes. It freezes all layers except the final fully connected layer, then trains for one epoch.
```python
import torch
import torchvision.models as models
from torch.utils.data import DataLoader, TensorDataset

# Create dummy dataset
inputs = torch.randn(20, 3, 224, 224)  # 20 images
labels = torch.randint(0, 10, (20,))   # 10 classes
dataset = TensorDataset(inputs, labels)
dataloader = DataLoader(dataset, batch_size=5)

# Load pretrained model
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final layer
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Only final layer params
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for inputs_batch, labels_batch in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs_batch)
    loss = criterion(outputs, labels_batch)
    loss.backward()
    optimizer.step()
    print(f"Loss: {loss.item():.4f}")
```
Output
Loss: 2.3026
Loss: 2.3024
Loss: 2.3023
Loss: 2.3021
Common Pitfalls
- Not freezing layers: Forgetting to set requires_grad = False trains all layers, which is slow and can overfit on small datasets.
- Wrong optimizer params: Passing all model parameters to the optimizer instead of only the unfrozen ones wastes memory and computation.
- Incorrect final layer size: An output layer that doesn't match your number of classes causes shape errors.
- Forgetting model.train(): Leaving the model in evaluation mode disables dropout and stops batch norm statistics from updating during training.
```python
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)

# WRONG: not freezing layers
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # trains all layers

# CORRECT: freeze layers and train only the final one
for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, 10)
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
```
Quick Reference
Summary tips for fine tuning in PyTorch:
- Load a pretrained model with models.resnet18(pretrained=True).
- Freeze layers by setting param.requires_grad = False.
- Replace the final layer to match your task's output size.
- Pass only the unfrozen parameters to the optimizer.
- Train with your dataset and loss function.
Key Takeaways
- Freeze pretrained layers by setting requires_grad=False to avoid updating them during training.
- Replace the model's final layer to match your specific task's number of output classes.
- Use an optimizer that updates only the unfrozen parameters to save computation.
- Always set the model to train mode with model.train() before training.
Fine tuning lets you adapt powerful pretrained models efficiently to new tasks.