How to Do Transfer Learning in PyTorch: Simple Guide
To do transfer learning in PyTorch, load a pretrained model from torchvision.models, freeze its layers if needed, replace the final layer to match your task, and train only the new layers. This lets you reuse learned features and adapt the model to your data quickly.
Syntax
Transfer learning in PyTorch typically involves these steps:
- Load a pretrained model with torchvision.models.
- Freeze the pretrained layers by setting requires_grad = False.
- Replace the final layer(s) to fit your new task's output size.
- Train the new layers while keeping the pretrained layers fixed, or fine-tune all layers.
```python
import torchvision.models as models
import torch.nn as nn

# Load a pretrained model
# (torchvision >= 0.13 prefers weights=models.ResNet18_Weights.DEFAULT)
model = models.resnet18(pretrained=True)

# Freeze the pretrained layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for the new task (e.g., 10 classes)
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Now only model.fc parameters will be trained
```
Example
This example shows transfer learning with ResNet18 on a dummy dataset with 10 classes. It freezes pretrained layers and trains only the new final layer.
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models
from torch.utils.data import DataLoader, TensorDataset

# Create dummy data: 100 samples, 3x224x224 images, 10 classes
inputs = torch.randn(100, 3, 224, 224)
targets = torch.randint(0, 10, (100,))
dataset = TensorDataset(inputs, targets)
dataloader = DataLoader(dataset, batch_size=10)

# Load pretrained ResNet18
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Only the final layer's parameters will be updated
optimizer = optim.SGD(model.fc.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# Training loop for 1 epoch
model.train()
for inputs_batch, targets_batch in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs_batch)
    loss = criterion(outputs, targets_batch)
    loss.backward()
    optimizer.step()

print(f"Training loss after 1 epoch: {loss.item():.4f}")
Output
Training loss after 1 epoch: 2.3026
Common Pitfalls
Common mistakes when doing transfer learning in PyTorch include:
- Not freezing pretrained layers, causing slow training and overfitting.
- Forgetting to replace the final layer to match your task's output size.
- Trying to train all layers without adjusting learning rates, which can harm pretrained weights.
- Not calling model.train() during training or model.eval() during evaluation.
```python
import torchvision.models as models
import torch.nn as nn

# Wrong: not freezing layers
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 10)
# All layers will be trained, which is slow and can cause overfitting

# Right: freeze the pretrained layers first, then replace the head
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 10)
# The new layer's parameters default to requires_grad=True,
# so only the final layer trains
```
Quick Reference
Tips for transfer learning in PyTorch:
- Use pretrained=True to load models with learned weights.
- Freeze layers by setting param.requires_grad = False.
- Replace the final layer to match your number of classes.
- Train only the new layers first, then optionally fine-tune all layers with a smaller learning rate.
- Use model.train() and model.eval() modes correctly.
Key Takeaways
- Load pretrained models from torchvision and replace their final layer for your task.
- Freeze pretrained layers to keep learned features and speed up training.
- Train only the new layers first, then optionally fine-tune the whole model.
- Always set the model to train or eval mode appropriately.
- Adjust learning rates when fine-tuning to avoid destroying pretrained weights.