PyTorchml~20 mins

torchvision pre-trained models in PyTorch - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Experiment - torchvision pre-trained models

Problem:You want to classify images into categories using a deep learning model. You are using a torchvision pre-trained model (ResNet18) on a small custom dataset. The model trains quickly and achieves 98% accuracy on the training set but only 70% accuracy on the validation set.

Current Metrics:Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 0.85

Issue:The model is overfitting: it performs very well on training data but poorly on validation data.

Your Task

Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.

You must use the torchvision pre-trained ResNet18 model.

You can only modify training hyperparameters and add regularization techniques.

Do not change the dataset or model architecture drastically.

Hint 1

Hint 2

Hint 3

Hint 4

Solution

PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader

# Data augmentation and normalization for training
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Normalization for validation
val_transforms = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

# Load datasets
train_dataset = datasets.FakeData(transform=train_transforms)  # Replace with real dataset
val_dataset = datasets.FakeData(transform=val_transforms)      # Replace with real dataset

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

# Load pretrained ResNet18
model = models.resnet18(pretrained=True)
num_ftrs = model.fc.in_features
model.fc = nn.Sequential(
    nn.Dropout(0.5),  # Added dropout
    nn.Linear(num_ftrs, 10)  # Assuming 10 classes
)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0005, weight_decay=1e-4)  # Added weight decay

num_epochs = 10
best_val_acc = 0.0

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    running_corrects = 0
    total = 0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * inputs.size(0)
        _, preds = torch.max(outputs, 1)
        running_corrects += torch.sum(preds == labels.data)
        total += labels.size(0)
    train_loss = running_loss / total
    train_acc = running_corrects.double() / total

    model.eval()
    val_loss = 0.0
    val_corrects = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * inputs.size(0)
            _, preds = torch.max(outputs, 1)
            val_corrects += torch.sum(preds == labels.data)
            val_total += labels.size(0)
    val_loss /= val_total
    val_acc = val_corrects.double() / val_total

    if val_acc > best_val_acc:
        best_val_acc = val_acc

    print(f'Epoch {epoch+1}/{num_epochs} - '
          f'Train loss: {train_loss:.4f}, Train acc: {train_acc:.4f} - '
          f'Val loss: {val_loss:.4f}, Val acc: {val_acc:.4f}')

Added data augmentation to training data to increase variety.

Added dropout layer before the final fully connected layer to reduce overfitting.

Used weight decay (L2 regularization) in the Adam optimizer.

Reduced learning rate to 0.0005 for smoother training.

Results Interpretation

Before: Training accuracy 98%, Validation accuracy 70%, Training loss 0.05, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.40

Adding dropout, weight decay, and data augmentation helps reduce overfitting. The model generalizes better, improving validation accuracy while slightly lowering training accuracy.

Bonus Experiment

Try using a different pretrained model like MobileNetV2 and compare the validation accuracy and training time.

💡 Hint

MobileNetV2 is lighter and faster. Adjust learning rate and batch size accordingly.

Practice

(1/5)

1. What is the main advantage of using torchvision pre-trained models?

easy

A. They automatically improve your dataset quality.

B. They generate new images from text descriptions.

C. They reduce the size of your images.

D. They allow you to use powerful image models without training from scratch.

torchvision pre-trained models in PyTorch - ML Experiment: Train & Evaluate

Start learning this pattern below

Practice

Solution

Step 1: Understand what pre-trained models do

Step 2: Identify the main benefit

Final Answer:

Quick Check:

Solution

Step 1: Recall the updated torchvision syntax

Step 2: Identify the correct syntax for ResNet18

Final Answer:

Quick Check:

Solution

Step 1: Understand ResNet18 output size

Step 2: Check input batch size and output shape

Final Answer:

Quick Check:

Solution

Step 1: Check model mode for prediction

Step 2: Identify the missing step

Final Answer:

Quick Check:

Solution

Step 1: Identify the final layer of ResNet18

Step 2: Replace final layer for 5 classes

Final Answer:

Quick Check: