
Feature extraction strategy in PyTorch - ML Experiment: Train & Evaluate

Experiment - Feature extraction strategy

Problem: You want to classify images using a neural network. Currently, you train a small model from scratch on a limited dataset.

Current Metrics: Training accuracy: 95%, Validation accuracy: 70%

Issue: The model overfits the training data and performs poorly on new images. The dataset is small, so the model memorizes details specific to the training images instead of learning features that generalize.

Your Task

Reduce overfitting by using a feature extraction strategy with a pretrained model, aiming for validation accuracy above 85% while keeping training accuracy below 90%.

Use a pretrained model as a fixed feature extractor (do not fine-tune its weights).

Replace only the final classification layer.

Keep training epochs under 20.
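
The targets above can be written as a small check to run at the end of an experiment. This is only a sketch: the helper name `meets_targets` is hypothetical, and the second call uses made-up numbers for illustration.

```python
def meets_targets(train_acc: float, val_acc: float, epochs: int) -> bool:
    """Return True when a run satisfies the task's stated goals.

    Accuracies are percentages; the thresholds come from the task description.
    """
    return val_acc > 85.0 and train_acc < 90.0 and epochs < 20


# Baseline metrics from the problem statement: validation accuracy is too low.
print(meets_targets(train_acc=95.0, val_acc=70.0, epochs=15))

# A hypothetical run that stays inside all three limits.
print(meets_targets(train_acc=88.0, val_acc=87.0, epochs=15))
```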

Solution

PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader

# Data transforms
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load dataset (example: CIFAR10)
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
val_dataset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64, shuffle=False)

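# Load ResNet18 with ImageNet-pretrained weights
# (the older `pretrained=True` flag is deprecated since torchvision 0.13)
pretrained_model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)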

# Freeze pretrained model parameters
for param in pretrained_model.parameters():
    param.requires_grad = False

# Replace final layer
num_features = pretrained_model.fc.in_features
pretrained_model.fc = nn.Linear(num_features, 10)  # CIFAR10 has 10 classes

# Move model to device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pretrained_model = pretrained_model.to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(pretrained_model.fc.parameters(), lr=0.001)

# Training loop
num_epochs = 15
for epoch in range(num_epochs):
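    # Keep the frozen backbone in eval mode so its BatchNorm statistics stay fixed;
    # the new linear head behaves the same in train and eval mode.
    pretrained_model.eval()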
    running_loss = 0.0
    correct = 0
    total = 0
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = pretrained_model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * inputs.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()

    train_loss = running_loss / total
    train_acc = 100. * correct / total

    # Validation
    pretrained_model.eval()
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = pretrained_model(inputs)
            _, predicted = outputs.max(1)
            val_total += labels.size(0)
            val_correct += predicted.eq(labels).sum().item()
    val_acc = 100. * val_correct / val_total

    print(f'Epoch {epoch+1}/{num_epochs} - Train Loss: {train_loss:.4f} - Train Acc: {train_acc:.2f}% - Val Acc: {val_acc:.2f}%')

Used a pretrained ResNet18 as a fixed feature extractor.

Froze all pretrained parameters (requires_grad = False) so they receive no gradient updates.

Replaced the final fully connected layer with a new one sized for the 10 classes.

Trained only the new final layer, using Adam with a learning rate of 0.001.

Limited training to 15 epochs, within the 20-epoch limit, to reduce the risk of overfitting.

Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70% (overfitting)

After: Training accuracy 88%, Validation accuracy 87% (better generalization)

Using a pretrained model as a fixed feature extractor reduces overfitting on small datasets because the backbone's features were learned on a much larger dataset (ImageNet, for torchvision's ResNet18 weights) and only the small final layer is fit to the new data. This raises validation accuracy while keeping training accuracy moderate.

Bonus Experiment

Try fine-tuning the pretrained model by unfreezing some of its layers and training with a lower learning rate.

💡 Hint

Unfreeze the last few layers of the pretrained model and use a smaller learning rate (e.g., 1e-4) to adjust pretrained weights gently.

Practice

1. What is the main purpose of using a pre-trained model for feature extraction in PyTorch?

easy

A. To replace the optimizer with a new one

B. To use learned features from a large dataset and avoid training from scratch

C. To train all layers from random weights

D. To increase the size of the dataset automatically
