Fine-tuning helps a pre-trained model learn new tasks faster by adjusting it slightly instead of starting from scratch.
Fine-tuning strategy in PyTorch
Introduction
You want to teach a model to recognize new types of images but have limited data.
You have a language model trained on general text and want it to work well on medical documents.
You want to improve a speech recognition model for a specific accent.
You want to save time and computing power by building on an existing model.
You want to customize a chatbot to understand your company's products better.
Syntax
PyTorch
import torch
import torchvision.models as models

# Load a pre-trained ResNet-18
# (pretrained=True is deprecated in newer torchvision; use weights=models.ResNet18_Weights.DEFAULT)
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer to match the new task
# (num_classes: number of classes in your task; define it before use)
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)

# Training loop only updates model.fc parameters
optimizer = torch.optim.Adam(model.fc.parameters(), lr=0.001)
Set requires_grad = False to freeze layers you don't want to change.
Replace the last layer to match your new task's output classes.
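A quick way to confirm the freeze worked is to count trainable versus total parameters. This is a minimal sketch using a tiny stand-in model (the same pattern applies unchanged to ResNet):

```python
import torch
import torch.nn as nn

# Tiny stand-in model (hypothetical; substitute your pre-trained network)
model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 4),
)

# Freeze everything, then unfreeze only the last layer
for param in model.parameters():
    param.requires_grad = False
for param in model[2].parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable} / {total}")
```

If the counts are not what you expect, a layer you meant to freeze is still trainable.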
Examples
Freeze all layers except the last fully connected layer for a 10-class problem.
PyTorch
import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)

# Freeze every layer
for param in model.parameters():
    param.requires_grad = False

# New head for a 10-class problem
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Optimize only the new head
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
Fine-tune only the last block of layers (layer4) while freezing the rest.
PyTorch
# Unfreeze layer4; freeze everything else
for name, param in model.named_parameters():
    if 'layer4' in name:
        param.requires_grad = True
    else:
        param.requires_grad = False
# Note: if you replaced model.fc, you will usually want to keep it trainable too
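As a sanity check, you can list which parameters remain trainable after selective unfreezing. This sketch uses a hypothetical stand-in module with ResNet-style attribute names (so nothing is downloaded), and also keeps the replaced fc head trainable, which is usually what you want:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in with ResNet-like attribute names
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(8, 8)
        self.layer4 = nn.Linear(8, 8)
        self.fc = nn.Linear(8, 3)

model = TinyNet()

# Unfreeze layer4 and the new fc head; freeze everything else
for name, param in model.named_parameters():
    param.requires_grad = 'layer4' in name or 'fc' in name

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

The printed names should contain only layer4 and fc entries; anything else indicates a layer was unfrozen by accident.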
Only optimize parameters that require gradients (unfrozen layers).
PyTorch
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001)
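You can verify that the filter passes only unfrozen parameters to the optimizer by counting the parameters in its param groups. A minimal sketch with a tiny stand-in model:

```python
import torch
import torch.nn as nn

# Stand-in model: freeze the first layer, leave the second trainable
model = nn.Sequential(nn.Linear(4, 8), nn.Linear(8, 2))
for p in model[0].parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, model.parameters()), lr=0.0001
)

# Count parameters the optimizer will actually update
n_opt = sum(p.numel() for g in optimizer.param_groups for p in g['params'])
print(n_opt)
```

Only the second layer's weights and bias end up in the optimizer.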
Sample Model
This code fine-tunes only the last layer of a ResNet18 model on a small dummy dataset with 3 classes.
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models

# Load pre-trained model
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer for 3 classes
num_classes = 3
model.fc = nn.Linear(model.fc.in_features, num_classes)

# Only parameters of the final layer will be updated
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)

# Dummy input and target
inputs = torch.randn(5, 3, 224, 224)
targets = torch.tensor([0, 1, 2, 1, 0])

# Loss function
criterion = nn.CrossEntropyLoss()

# Training step
model.train()
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()

# Print loss and predictions
print(f"Loss: {loss.item():.4f}")
_, preds = torch.max(outputs, 1)
print(f"Predictions: {preds.tolist()}")
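After training, the model should be switched to evaluation mode and run without gradient tracking for inference. A minimal sketch with a tiny stand-in model (the same pattern applies to the fine-tuned ResNet above):

```python
import torch
import torch.nn as nn

# Tiny stand-in for the fine-tuned model
model = nn.Linear(10, 3)

model.eval()                      # disable dropout/batch-norm updates
with torch.no_grad():             # no gradients needed for inference
    logits = model(torch.randn(2, 10))
    preds = logits.argmax(dim=1)  # predicted class per sample
print(preds.shape)
```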
Important Notes
Freezing layers helps keep learned features and reduces training time.
Fine-tuning too many layers with little data can cause overfitting.
Adjust learning rate carefully; often smaller rates work better for fine-tuning.
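One common way to apply this advice is per-parameter-group learning rates: a small rate for the pre-trained backbone and a larger one for the freshly initialized head. A minimal sketch with hypothetical backbone/head modules:

```python
import torch
import torch.nn as nn

# Hypothetical two-part model: pre-trained "backbone" and new "head"
backbone = nn.Linear(8, 8)
head = nn.Linear(8, 3)

# Smaller learning rate for the backbone, larger for the fresh head
optimizer = torch.optim.Adam([
    {'params': backbone.parameters(), 'lr': 1e-5},
    {'params': head.parameters(), 'lr': 1e-3},
])
print([g['lr'] for g in optimizer.param_groups])
```

This lets the backbone drift only slightly while the new head trains at full speed.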
Summary
Fine-tuning adjusts a pre-trained model to a new task by training some layers.
Freeze layers you want to keep unchanged by setting requires_grad = False.
Replace the last layer to match your new task's output classes and train only it or a few layers.