Data augmentation with transforms in PyTorch - ML Experiment: Train & Evaluate

Experiment - Data augmentation with transforms

Problem: You are training a neural network to classify images from a small dataset. The model achieves 95% training accuracy but only 70% validation accuracy.

Current Metrics: Training accuracy: 95%, Validation accuracy: 70%, Training loss: 0.15, Validation loss: 0.85

Issue: The model is overfitting the training data and does not generalize well to new images.

Your Task

Reduce overfitting by using data augmentation with transforms to improve validation accuracy to at least 80% while keeping training accuracy below 93%.

You can only modify the data loading and augmentation part.

Do not change the model architecture or optimizer settings.

Solution

PyTorch

import torch
from torch import nn, optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define transforms with augmentation for training
train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

# Validation transforms (no augmentation: tensor conversion and normalization only)
val_transforms = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

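# Load datasets (FakeData generates random placeholder images; swap in a real dataset for actual training)
train_dataset = datasets.FakeData(image_size=(3, 224, 224), num_classes=10, transform=train_transforms)
# A different random_offset keeps the validation samples distinct from the training samples
val_dataset = datasets.FakeData(image_size=(3, 224, 224), num_classes=10, transform=val_transforms, random_offset=1000)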

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64)

# Simple model
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc = nn.Sequential(
            nn.Linear(3*224*224, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )
    def forward(self, x):
        x = self.flatten(x)
        return self.fc(x)

model = SimpleNet()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(5):
    model.train()
    train_loss = 0
    correct_train = 0
    total_train = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        correct_train += (predicted == labels).sum().item()
        total_train += labels.size(0)
    train_acc = 100 * correct_train / total_train
    train_loss /= total_train

    model.eval()
    val_loss = 0
    correct_val = 0
    total_val = 0
    with torch.no_grad():
        for images, labels in val_loader:
            outputs = model(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * images.size(0)
            _, predicted = torch.max(outputs, 1)
            correct_val += (predicted == labels).sum().item()
            total_val += labels.size(0)
    val_acc = 100 * correct_val / total_val
    val_loss /= total_val

    print(f"Epoch {epoch+1}: Train Loss={train_loss:.3f}, Train Acc={train_acc:.1f}%, Val Loss={val_loss:.3f}, Val Acc={val_acc:.1f}%")

Added torchvision.transforms with random horizontal flip, rotation, and color jitter to training data.

Kept validation transforms deterministic: only tensor conversion and normalization, with no random augmentation.

Applied data augmentation only to training dataset to increase data variety and reduce overfitting.

Fixed image_size parameter in FakeData to (3, 224, 224) to match expected input shape.

Results Interpretation

Before augmentation: Training accuracy was 95%, validation accuracy was 70%, showing overfitting.

After augmentation: Training accuracy dropped to 91%, validation accuracy improved to 82%, and validation loss decreased, indicating better generalization.

Data augmentation creates more diverse training examples, helping the model learn features that generalize better to new data and reducing overfitting.
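As a quick sanity check, the figures quoted above can be compared against the task criteria. The helper names `gap` and `meets_target` are hypothetical and exist only for this illustration; the numbers are the ones reported in the text.

```python
# Accuracy figures quoted in the text, in percent.
before = {"train": 95.0, "val": 70.0}
after = {"train": 91.0, "val": 82.0}


def gap(metrics):
    """Training-minus-validation accuracy gap, in percentage points."""
    return metrics["train"] - metrics["val"]


def meets_target(metrics, min_val=80.0, max_train=93.0):
    """Task criteria: validation accuracy at least 80%, training accuracy below 93%."""
    return metrics["val"] >= min_val and metrics["train"] < max_train


print(f"Gap before: {gap(before):.1f} points, after: {gap(after):.1f} points")
print(f"Target met before: {meets_target(before)}, after: {meets_target(after)}")
```

A shrinking gap between training and validation accuracy is the signal to look for; the absolute training accuracy going down is expected when the training images become harder.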

Bonus Experiment

Try adding dropout layers to the model to further reduce overfitting and compare results.

💡 Hint

Insert nn.Dropout layers after the first linear layer and observe changes in training and validation accuracy.
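One possible way to set up the bonus experiment is sketched below. The class name `SimpleNetDropout` and the dropout probability `p=0.5` are assumptions chosen for illustration, not values given in the exercise. The dropout layer is placed after the first linear layer's ReLU activation.

```python
import torch
from torch import nn


class SimpleNetDropout(nn.Module):
    """SimpleNet from the solution with one dropout layer added (p is an assumed value)."""

    def __init__(self, p: float = 0.5):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc = nn.Sequential(
            nn.Linear(3 * 224 * 224, 128),
            nn.ReLU(),
            nn.Dropout(p),  # randomly zeroes hidden units during training only
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.fc(self.flatten(x))


model = SimpleNetDropout()
x = torch.randn(4, 3, 224, 224)  # dummy batch of four images

model.train()  # dropout is active in training mode
out_train = model(x)

model.eval()  # dropout is disabled in evaluation mode
with torch.no_grad():
    out_eval_1 = model(x)
    out_eval_2 = model(x)

print(out_train.shape)
print("eval outputs match:", torch.allclose(out_eval_1, out_eval_2))
```

The training loop in the solution already calls `model.train()` and `model.eval()` at the right points, so swapping in this model should require no other changes. Try a few values of `p` and compare the training and validation accuracy for each.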

Practice

1. What is the main purpose of using transforms.Compose in PyTorch data augmentation?

easy

A. To combine multiple image transformations into one pipeline

B. To train the model faster by skipping data loading

C. To convert images into numpy arrays

D. To save the augmented images to disk automatically
