Data augmentation creates additional training examples by applying transformations to existing data. This helps models generalize better and reduces overfitting.
Data augmentation in PyTorch
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(30),
    transforms.ToTensor()
])
Use transforms.Compose to combine multiple augmentations.
Apply augmentations only on training data, not on validation or test data.
transforms.RandomHorizontalFlip(p=0.5)
transforms.RandomRotation(degrees=45)
transforms.ColorJitter(brightness=0.2, contrast=0.2)
transforms.RandomResizedCrop(size=224)
This code loads the CIFAR10 training data and applies random horizontal flip and rotation to each image. It then prints the shape of one batch of images and their labels.
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define data augmentation transforms
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(20),
    transforms.ToTensor()
])

# Load CIFAR10 training dataset with augmentation
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True)

# Get one batch of images and labels
images, labels = next(iter(train_loader))
print(f'Batch image tensor shape: {images.shape}')
print(f'Batch labels: {labels}')
Apply the same normalization to training, validation, and test data, after any augmentation, so the model always sees inputs on the same scale.
Augmentation increases training time but improves model generalization.
Do not apply random augmentations to validation or test sets to get fair evaluation.
Data augmentation creates new training data by changing existing data.
It helps models learn better and avoid overfitting.
Use torchvision transforms like RandomHorizontalFlip and RandomRotation in PyTorch.
Practice
Solution
Step 1: Understand data augmentation concept
Data augmentation means making new training examples by changing existing ones, like flipping or rotating images.
Step 2: Identify the purpose in training
This helps the model see more variety and avoid memorizing only the original data, improving learning.
Final Answer:
To create new training data by modifying existing data -> Option B
Quick Check:
Data augmentation = create new data [OK]
- Thinking it reduces dataset size
- Confusing augmentation with speeding training
- Believing it changes file formats
Solution
Step 1: Recall torchvision transform syntax
The correct transform for horizontal flip is RandomHorizontalFlip with a probability parameter p.
Step 2: Match correct syntax
transforms.RandomHorizontalFlip(p=0.5) uses the correct class name and the correct keyword p, so it matches the torchvision API exactly.
Final Answer:
transforms.RandomHorizontalFlip(p=0.5) -> Option A
Quick Check:
Correct transform name and parameter = transforms.RandomHorizontalFlip(p=0.5) [OK]
- Using wrong transform names
- Using 'prob' instead of 'p'
- Incorrect parameter names or missing parentheses
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomRotation(30),
    transforms.ToTensor()
])
image = Image.open('sample.jpg')
tensor_image = transform(image)
print(tensor_image.shape)
Solution
Step 1: Understand transforms.Compose and RandomRotation
RandomRotation rotates the image but, with its default setting expand=False, keeps the original size (height and width). ToTensor converts the image to a tensor with shape [channels, height, width].
Step 2: Determine output tensor shape
Since the image is color (3 channels), the tensor shape will be [3, H, W], where H and W are original height and width.
Final Answer:
[3, H, W] where H and W are original image height and width -> Option A
Quick Check:
Rotation keeps size, ToTensor outputs [3, H, W] [OK]
- Confusing channel order as last dimension
- Assuming rotation changes image size
- Thinking output is grayscale shape
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(prob=0.5),
    transforms.RandomRotation(degrees=45),
    transforms.ToTensor()
])
Solution
Step 1: Check RandomHorizontalFlip usage
RandomHorizontalFlip requires the probability argument as p=0.5, not prob=0.5.
Step 2: Verify other transforms
RandomRotation accepts a single number for degrees, ToTensor can come last, and Compose supports multiple transforms.
Final Answer:
RandomHorizontalFlip should use keyword argument p=0.5 -> Option C
Quick Check:
Correct argument name = p [OK]
- Passing positional argument instead of keyword
- Thinking degrees must be tuple
- Misordering transforms in Compose
Options:
A) RandomHorizontalFlip(p=0.5) + RandomRotation(15) + ColorJitter(brightness=0.2)
B) RandomResizedCrop(size=224) + Grayscale(num_output_channels=1)
C) RandomVerticalFlip(p=1.0) + RandomRotation(90) + ToTensor()
D) Resize(128) + RandomCrop(64) + RandomHorizontalFlip(p=0.5)
Solution
Step 1: Analyze each option's effect on size and channels
Option A (RandomHorizontalFlip, small RandomRotation, and ColorJitter) flips, rotates slightly, and changes brightness without resizing or changing channels. Option B (RandomResizedCrop and Grayscale) changes the image size and reduces it to one channel. Option C (RandomVerticalFlip with p=1.0 and RandomRotation(90)) always flips vertically and can rotate by up to 90 degrees, which changes orientation drastically. Option D (Resize, RandomCrop, and RandomHorizontalFlip) resizes and crops, changing the image size.
Step 2: Choose the option that keeps size and channels but increases variety
Option A best fits the requirement by augmenting with flips, small rotations, and brightness changes without altering size or channels.
Final Answer:
RandomHorizontalFlip, small RandomRotation, and ColorJitter to vary brightness -> Option A
Quick Check:
Keep size and channels, add mild augmentations = A [OK]
- Choosing transforms that resize images
- Converting images to grayscale unintentionally
- Using large rotations that distort orientation
