Data augmentation helps your model learn better by creating new, slightly changed versions of your images. Training on these variations makes the model more robust and helps it generalize to images it has not seen.
Data augmentation with transforms in PyTorch
from torchvision import transforms

transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=30),
    transforms.ToTensor()
])
Compose lets you combine many transforms to apply one after another.
Transforms like RandomHorizontalFlip and RandomRotation change images randomly to create variety.
transform = transforms.RandomVerticalFlip(p=0.7)
transform = transforms.ColorJitter(brightness=0.5, contrast=0.5)
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor()
])
This program loads an image from the internet, applies a horizontal flip and a random rotation, then converts it to a tensor. It prints the original image size, the shape of the tensor, and the pixel value range.
import torch
from torchvision import transforms
from PIL import Image
import requests
from io import BytesIO

# Load an example image from the web
url = 'https://pytorch.org/assets/images/deeplab1.png'
response = requests.get(url)
img = Image.open(BytesIO(response.content))

# Define data augmentation transforms
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=1.0),  # Always flip horizontally
    transforms.RandomRotation(degrees=45),   # Rotate randomly up to 45 degrees
    transforms.ToTensor()                    # Convert image to tensor
])

# Apply transform
augmented_img = transform(img)

# Show original and augmented image sizes and tensor shape
print(f'Original image size: {img.size}')
print(f'Augmented tensor shape: {augmented_img.shape}')

# Check pixel value range
print(f'Pixel value range: min={augmented_img.min():.3f}, max={augmented_img.max():.3f}')
Always convert images to tensors after augmentation to use them in PyTorch models.
Random transforms add variety but can make training results different each time.
You can combine many transforms to create powerful data augmentation pipelines.
Data augmentation creates new image versions to help models learn better.
Use transforms.Compose to combine multiple changes like flips and rotations.
Always convert images to tensors before feeding them to your model.
Practice
What is the main purpose of transforms.Compose in PyTorch data augmentation?
Solution
Step 1: Understand the role of transforms.Compose
transforms.Compose is used to chain several image transformations so they apply sequentially to the input image.
Step 2: Identify the correct purpose
It does not speed up training directly, convert images to NumPy arrays, or save images. Its main job is combining transformations.
Final Answer:
To combine multiple image transformations into one pipeline -> Option A
Quick Check:
transforms.Compose = combine transforms [OK]
- Thinking Compose speeds up training
- Confusing Compose with image saving
- Assuming Compose converts image formats
Solution
Step 1: Check the syntax for combining transforms
PyTorch requires transforms to be passed as a list inside transforms.Compose([]).
Step 2: Validate each option
transforms.Compose([transforms.RandomHorizontalFlip(), transforms.ToTensor()]) correctly uses a list inside Compose. The other options misuse function calls or pass arguments incorrectly.
Final Answer:
transforms.Compose([transforms.RandomHorizontalFlip(), transforms.ToTensor()]) -> Option B
Quick Check:
Compose needs list of transforms [OK]
- Passing transforms as separate arguments instead of a list
- Calling transforms inside each other incorrectly
- Forgetting to convert images to tensor
transform = transforms.Compose([
    transforms.RandomRotation(90),
    transforms.ToTensor()
])
output = transform(image)
Solution
Step 1: Understand the transform effects
RandomRotation rotates the image but, with the default expand=False, does not change its size or channels. ToTensor converts the image to a tensor with shape [channels, height, width].
Step 2: Determine output shape
The input image has 3 channels and is 64x64 pixels. After ToTensor, the shape is [3, 64, 64]; the rotation keeps the size the same.
Final Answer:
[3, 64, 64] -> Option C
Quick Check:
ToTensor output shape = [channels, height, width] [OK]
- Confusing channel position in tensor shape
- Assuming rotation changes image size
- Thinking output is a numpy array shape
transform = transforms.Compose([
    transforms.RandomCrop(32),
    transforms.ToTensor,
    transforms.Normalize((0.5,), (0.5,))
])
Solution
Step 1: Check each transform usage
RandomCrop accepts an integer for size, so that is correct. Normalize accepts tuples for mean and std, so that is correct.
Step 2: Identify the missing parentheses
transforms.ToTensor is a class, but it must be called as transforms.ToTensor() to create the transform instance.
Final Answer:
transforms.ToTensor is missing parentheses to call it -> Option A
Quick Check:
Call ToTensor() with parentheses [OK]
- Forgetting parentheses on transform classes
- Thinking Normalize needs lists instead of tuples
- Misunderstanding RandomCrop size argument
Solution
Step 1: Order of transforms matters
Augmentations such as flipping and rotation are conventionally applied to the image before it is converted to a tensor. Normalize operates on tensors, so it must come after ToTensor.
Step 2: Check each option's order and parameters
transforms.Compose([transforms.RandomHorizontalFlip(), transforms.RandomRotation(30), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) applies the flip and rotation first, then ToTensor, then Normalize with a mean and std for each of the 3 channels. The other options have the wrong order or are missing steps.
Final Answer:
transforms.Compose([transforms.RandomHorizontalFlip(), transforms.RandomRotation(30), transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) -> Option D
Quick Check:
Augment before ToTensor, normalize after [OK]
- Normalizing before ToTensor
- Applying augmentations after ToTensor
- Using wrong mean/std shapes for Normalize
