How to Fine Tune a Model in PyTorch: Step-by-Step Guide
To fine tune a model in PyTorch, load a pretrained model from torchvision.models, freeze its early layers by setting requires_grad = False, replace the final layer to match your task, then train only the unfrozen layers on your dataset with an optimizer and loss function.
Syntax
Fine tuning in PyTorch involves these key steps:
- Load pretrained model: Use torchvision.models or other sources.
- Freeze layers: Set param.requires_grad = False for layers you don't want to update.
- Modify final layer: Replace the last layer to fit your output classes.
- Define optimizer: Pass only parameters with requires_grad=True.
- Train: Run the training loop, updating only the unfrozen layers.
```python
import torch
import torchvision.models as models

# Load pretrained model
# (on torchvision >= 0.13, prefer weights=models.ResNet18_Weights.DEFAULT,
# as the pretrained argument is deprecated)
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final layer for 10 classes
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Only parameters of the final layer will be updated
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.001)

# Loss function
criterion = torch.nn.CrossEntropyLoss()
```
Example
This example fine tunes a pretrained ResNet18 on a dummy dataset with 10 classes. It freezes all layers except the final fully connected layer, then trains for one epoch.
```python
import torch
import torchvision.models as models
from torch.utils.data import DataLoader, TensorDataset

# Create dummy dataset
inputs = torch.randn(20, 3, 224, 224)  # 20 images
labels = torch.randint(0, 10, (20,))   # 10 classes
dataset = TensorDataset(inputs, labels)
dataloader = DataLoader(dataset, batch_size=5)

# Load pretrained model
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace final layer
model.fc = torch.nn.Linear(model.fc.in_features, 10)

# Only final layer params
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for inputs_batch, labels_batch in dataloader:
    optimizer.zero_grad()
    outputs = model(inputs_batch)
    loss = criterion(outputs, labels_batch)
    loss.backward()
    optimizer.step()
    print(f"Loss: {loss.item():.4f}")
```
Output
Loss: 2.3026
Loss: 2.3024
Loss: 2.3023
Loss: 2.3021
Common Pitfalls
- Not freezing layers: Forgetting to set requires_grad = False trains all layers, which is slow and can overfit on small datasets.
- Wrong optimizer params: Passing all model parameters to the optimizer instead of only the unfrozen ones wastes memory and computation.
- Incorrect final layer size: An output layer that doesn't match your number of classes causes shape errors.
- Forgetting model.train(): Leaving the model in evaluation mode disables dropout and stops batch norm statistics from updating during training.
```python
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)

# WRONG: not freezing layers
# optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # trains all layers

# CORRECT: freeze layers and train only the final one
for param in model.parameters():
    param.requires_grad = False
model.fc = torch.nn.Linear(model.fc.in_features, 10)
optimizer = torch.optim.SGD(model.fc.parameters(), lr=0.01)
```
Quick Reference
Summary tips for fine tuning in PyTorch:
- Load a pretrained model with models.resnet18(pretrained=True).
- Freeze layers by setting param.requires_grad = False.
- Replace the final layer to match your task's output size.
- Pass only the unfrozen parameters to the optimizer.
- Train with your dataset and loss function.
Key Takeaways
- Freeze pretrained layers by setting requires_grad=False to avoid updating them during training.
- Replace the model's final layer to match your specific task's number of output classes.
- Use an optimizer that updates only the unfrozen parameters to save computation.
- Always set the model to train mode with model.train() before training.
Fine tuning lets you adapt powerful pretrained models efficiently to new tasks.