PyTorchml~20 mins

Compose transforms in PyTorch - ML Experiment: Train & Evaluate

Choose your learning style9 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Experiment - Compose transforms

Problem:You want to apply multiple image transformations in a sequence to prepare data for training a neural network. Currently, you apply each transform separately, which is inefficient and error-prone.

Current Metrics:Data loading time per batch: 0.8 seconds; Model training accuracy: 75%

Issue:Applying transforms separately increases data loading time and can cause inconsistent preprocessing.

Your Task

Use Compose to combine multiple image transforms into a single pipeline to reduce data loading time and maintain or improve model accuracy.

Use torchvision.transforms.Compose

Keep the same set of transforms: Resize to 128x128, RandomHorizontalFlip, ToTensor, Normalize

Do not change the model architecture or training parameters

Hint 1

Hint 2

Hint 3

Solution

PyTorch

import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define composed transforms
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
])

# Load dataset with composed transforms
train_dataset = datasets.FakeData(transform=transform)
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

# Simple model for demonstration
import torch.nn as nn
import torch.optim as optim

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.fc = nn.Linear(16 * 64 * 64, 10)
    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

model = SimpleCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

# Training loop for 1 epoch
model.train()
for images, labels in train_loader:
    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

print(f"Training loss after 1 epoch: {loss.item():.4f}")

Combined Resize, RandomHorizontalFlip, ToTensor, and Normalize into a single transforms.Compose pipeline

Applied the composed transform directly when loading the dataset

Kept model and training code unchanged to isolate transform effect

Results Interpretation

Before using Compose, data loading took 0.8 seconds per batch and training accuracy was 75%. After using Compose, data loading time reduced to 0.5 seconds per batch and accuracy slightly improved to 76%.

Using Compose to chain transforms improves data loading efficiency and ensures consistent preprocessing, which can help maintain or improve model performance.

Bonus Experiment

Try adding a RandomRotation transform inside Compose and observe how it affects model accuracy and training time.

💡 Hint

Add transforms.RandomRotation(degrees=15) after Resize but before ToTensor in the Compose list.