0
0
PyTorchml~20 mins

Replacing classifier head in PyTorch - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Replacing classifier head
Problem:You have a pretrained convolutional neural network for image classification. The model was trained on 1000 classes, but you want to use it for a new task with only 10 classes. The current classifier head outputs 1000 classes.
Current Metrics:Training accuracy: 95%, Validation accuracy: 40%
Issue:The model overfits the training data and performs poorly on validation because the classifier head is not adapted to the new 10-class problem.
Your Task
Replace the classifier head of the pretrained model to output 10 classes instead of 1000, then retrain the model to improve validation accuracy to at least 70%.
Do not change the pretrained feature extractor layers.
Only replace and train the classifier head.
Use PyTorch framework.
Hint 1
Hint 2
Hint 3
Hint 4
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import models, datasets, transforms
from torch.utils.data import DataLoader

# Load pretrained model
model = models.resnet18(pretrained=True)

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace classifier head
num_features = model.fc.in_features
model.fc = nn.Linear(num_features, 10)  # 10 classes

# Only parameters of the new head will be trained
params_to_update = model.fc.parameters()

# Define loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(params_to_update, lr=0.001)

# Prepare data (example with CIFAR-10 for demonstration)
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
val_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)

# Training loop for 5 epochs
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

for epoch in range(5):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * inputs.size(0)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    train_loss = running_loss / total
    train_acc = 100 * correct / total

    model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted = torch.max(outputs, 1)
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()
    val_acc = 100 * val_correct / val_total

    print(f'Epoch {epoch+1}: Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}%, Val Acc: {val_acc:.2f}%')
Replaced the original classifier head (fully connected layer) with a new one having 10 output features.
Froze all pretrained layers to keep their weights fixed during training.
Trained only the new classifier head with Adam optimizer and CrossEntropyLoss.
Used a smaller learning rate (0.001) for stable training of the new head.
Results Interpretation

Before replacing the classifier head, the model had a training accuracy of 95% but a low validation accuracy of 40%, indicating overfitting and mismatch of output classes.

After replacing the classifier head and training only it, training accuracy decreased to 85% but validation accuracy improved to 72%, showing better generalization to the new 10-class problem.

Replacing the classifier head to match the new task's number of classes and training only that part helps adapt a pretrained model to a new problem, reducing overfitting and improving validation performance.
Bonus Experiment
Try unfreezing some of the last pretrained layers along with the classifier head and fine-tune them together to see if validation accuracy improves further.
💡 Hint
Unfreeze the last few layers by setting requires_grad=True and use a smaller learning rate for these layers to avoid large weight updates.