How to Use Transfer Learning in PyTorch: Simple Guide
To use transfer learning in PyTorch, load a pretrained model from `torchvision.models`, freeze its early layers to keep the learned features, and replace the final layer to match your task. Then train only the new layers, or fine-tune the whole model on your dataset.

Syntax
Transfer learning in PyTorch typically involves these steps:
- Load a pretrained model from `torchvision.models`.
- Freeze layers by setting `param.requires_grad = False` to keep learned features.
- Replace the final classification layer to fit your number of classes.
- Define optimizer and loss, then train the model.
```python
import torch
import torchvision.models as models

# Load pretrained model
# (newer torchvision versions prefer weights=models.ResNet18_Weights.DEFAULT
# over the deprecated pretrained=True)
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer
num_features = model.fc.in_features
num_classes = 10  # num_classes is your target count
model.fc = torch.nn.Linear(num_features, num_classes)

# Only parameters of the final layer will be updated
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)

# Define loss
criterion = torch.nn.CrossEntropyLoss()
```
Example
This example shows how to use transfer learning with ResNet18 on a custom dataset with 2 classes. It freezes pretrained layers and trains only the final layer.
```python
import torch
import torchvision.models as models
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Setup device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Data transforms (ImageNet mean and std, as expected by the pretrained weights)
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Load example dataset (replace with your own);
# num_classes must match the new head, otherwise the loss will error
dataset = datasets.FakeData(num_classes=2, transform=transform)
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)

# Load pretrained ResNet18
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final layer for 2 classes
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 2)
model = model.to(device)

# Only train the final layer
optimizer = optim.SGD(model.fc.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Training loop (1 epoch for demo)
model.train()
for inputs, labels in dataloader:
    inputs, labels = inputs.to(device), labels.to(device)
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    print(f'Loss: {loss.item():.4f}')
    break  # run one batch for demo
```
Output
Loss: 0.6931
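A loss near 0.6931 is what you should expect here: it equals ln(2), the cross-entropy of an untrained two-class classifier that assigns equal probability to both classes. Once the new head has been trained, inference follows the usual PyTorch pattern. A minimal sketch, reusing the `model`, `device`, and `transform` defined in the example above (`example.jpg` is a placeholder path):

```python
from PIL import Image

model.eval()  # disable dropout and batch-norm updates
with torch.no_grad():  # no gradients needed for inference
    image = Image.open('example.jpg').convert('RGB')  # placeholder path
    batch = transform(image).unsqueeze(0).to(device)  # add a batch dimension
    logits = model(batch)
    predicted_class = logits.argmax(dim=1).item()
    print(f'Predicted class: {predicted_class}')
```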
Common Pitfalls
Common mistakes when using transfer learning in PyTorch include:
- Not freezing pretrained layers, causing slow training and overfitting.
- Forgetting to replace the final layer to match your number of classes.
- Passing all model parameters to the optimizer instead of only trainable ones.
- Not normalizing input images with the same mean and std the pretrained model was trained with (a correct pipeline is sketched after the code below).
```python
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)

# Wrong: not freezing layers
# optimizer = torch.optim.SGD(model.parameters(), lr=0.001)  # trains all layers

# Right: freeze layers, then train only the new head
for param in model.parameters():
    param.requires_grad = False

num_features = model.fc.in_features
model.fc = torch.nn.Linear(num_features, 10)  # example: 10 classes

# Only train the final layer
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001)
```
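The normalization pitfall is easy to miss because mismatched statistics raise no error; they only degrade accuracy. Models from `torchvision.models` were pretrained on ImageNet, so inputs should use the ImageNet mean and std (the same values as in the example above). A minimal sketch of a correct preprocessing pipeline:

```python
from torchvision import transforms

# ImageNet statistics used when the torchvision models were pretrained
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

# Wrong: no normalization, so inputs don't match the pretrained distribution
# transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])

# Right: normalize with the same statistics the pretrained model expects
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(imagenet_mean, imagenet_std),
])
```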
Quick Reference
Summary tips for transfer learning in PyTorch:
- Use `torchvision.models` to get pretrained models.
- Freeze early layers by setting `param.requires_grad = False`.
- Replace the final layer to match your task's classes.
- Normalize inputs with pretrained model's mean and std.
- Train only the new layers first, or fine-tune by unfreezing some layers later (see the sketch below).
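When the new head has converged and you want more accuracy, you can unfreeze part of the backbone and keep training with a smaller learning rate for the pretrained weights. A minimal sketch, assuming the `model` from the examples above; unfreezing only `layer4` (the last ResNet block) is one common choice, not the only one:

```python
import torch

# Unfreeze the last residual block; the rest of the backbone stays frozen
for param in model.layer4.parameters():
    param.requires_grad = True

# Use a lower learning rate for pretrained weights than for the new head
optimizer = torch.optim.SGD([
    {'params': model.layer4.parameters(), 'lr': 1e-4},
    {'params': model.fc.parameters(), 'lr': 1e-3},
], momentum=0.9)

# Alternatively, pass only the trainable parameters in a single group:
# optimizer = torch.optim.SGD(
#     filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4, momentum=0.9)
```

Passing only trainable parameters to the optimizer also avoids the third pitfall above, since frozen weights are never registered for updates.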
Key Takeaways
- Load a pretrained model and freeze its layers to keep learned features.
- Replace the final layer to fit your number of output classes.
- Train only the new layers initially for faster, more stable training.
- Normalize input images with the pretrained model's expected mean and std.
- Fine-tune by unfreezing layers later if higher accuracy is needed.