How to Do Transfer Learning in PyTorch: Simple Guide
To do transfer learning in PyTorch, load a pretrained model from torchvision.models, freeze its layers if needed, replace the final layer to match your task, and train only the new layers. This lets you reuse learned features and adapt the model to your data quickly.
Syntax
Transfer learning in PyTorch typically involves these steps:
- Load a pretrained model with torchvision.models.
- Freeze the pretrained layers by setting requires_grad = False.
- Replace the final layer(s) to fit your new task's output size.
- Train the new layers while keeping the pretrained layers fixed, or fine-tune all layers.
```python
import torchvision.models as models
import torch.nn as nn

# Load a pretrained model
# (torchvision >= 0.13 prefers weights=models.ResNet18_Weights.DEFAULT)
model = models.resnet18(pretrained=True)

# Freeze the pretrained layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for the new task (e.g., 10 classes)
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Now only model.fc parameters will be trained
```
Example
This example shows transfer learning with ResNet18 on a dummy dataset with 10 classes. It freezes pretrained layers and trains only the new final layer.
```python
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models
from torch.utils.data import DataLoader, TensorDataset

# Create dummy data: 100 samples, 3x224x224 images, 10 classes
inputs = torch.randn(100, 3, 224, 224)
targets = torch.randint(0, 10, (100,))
dataset = TensorDataset(inputs, targets)
dataloader = DataLoader(dataset, batch_size=10)

# Load pretrained ResNet18
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)

# Only the final layer's parameters will be updated
optimizer = optim.SGD(model.fc.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

# Training loop for 1 epoch
model.train()
for inputs_batch, targets_batch in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs_batch)
    loss = criterion(outputs, targets_batch)
    loss.backward()
    optimizer.step()

print(f"Training loss after 1 epoch: {loss.item():.4f}")
Output
Training loss after 1 epoch: 2.3026
Common Pitfalls
Common mistakes when doing transfer learning in PyTorch include:
- Not freezing pretrained layers, causing slow training and overfitting.
- Forgetting to replace the final layer to match your task's output size.
- Trying to train all layers without adjusting learning rates, which can harm pretrained weights.
- Not calling model.train() during training or model.eval() during evaluation.
```python
import torchvision.models as models
import torch.nn as nn

# Wrong: not freezing layers
model = models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 10)
# All layers will be trained, which is slow and can cause overfitting

# Right: freeze the pretrained layers first, then replace the head
model = models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 10)
# The new layer's parameters default to requires_grad=True,
# so only the final layer trains
```
Quick Reference
Tips for transfer learning in PyTorch:
- Use pretrained=True to load models with learned weights.
- Freeze layers by setting param.requires_grad = False.
- Replace the final layer to match your number of classes.
- Train only the new layers first, then optionally fine-tune all layers with a smaller learning rate.
- Use model.train() and model.eval() modes correctly.
Key Takeaways
- Load pretrained models from torchvision and replace their final layer for your task.
- Freeze pretrained layers to keep learned features and speed up training.
- Train only the new layers first, then optionally fine-tune the whole model.
- Always set the model to train or eval mode appropriately.
- Adjust learning rates when fine-tuning to avoid destroying pretrained weights.