
Data augmentation in PyTorch - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is data augmentation in machine learning?
Data augmentation is a technique to increase the size and diversity of training data by making small changes to existing data, like flipping or rotating images. This helps models learn better and avoid overfitting.
beginner
Name two common image data augmentation techniques.
Two common image data augmentation techniques are flipping (horizontal or vertical) and rotation by small angles. These create new images that help the model see different views of the same object.
intermediate
How does data augmentation help prevent overfitting?
Data augmentation adds variety to training data, so the model doesn't memorize exact examples. This makes the model generalize better to new data, reducing overfitting.
beginner
Show a simple PyTorch code snippet to apply random horizontal flip to images during training.
from torchvision import transforms
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor()
])

This code flips each image horizontally with a 50% chance before converting it to a tensor.
intermediate
What is the difference between online and offline data augmentation?
Offline augmentation creates new data files before training, increasing dataset size on disk. Online augmentation applies random changes on the fly during training, saving storage and adding variety each epoch.
Which of the following is NOT a typical data augmentation technique for images?
A. Random rotation
B. Horizontal flip
C. Changing image file format
D. Adding Gaussian noise
Why do we use data augmentation in training machine learning models?
A. To reduce training time
B. To increase dataset size and variety
C. To make the model smaller
D. To remove noisy data
In PyTorch, which module provides common data augmentation transforms?
A. torch.utils.data
B. torch.optim
C. torch.nn
D. torchvision.transforms
What does RandomHorizontalFlip(p=0.5) do during training?
A. Flips images horizontally with 50% chance
B. Flips every image horizontally
C. Flips images vertically with 50% chance
D. Does nothing
Which is a benefit of online data augmentation over offline augmentation?
A. Adds variety each training epoch
B. Requires more disk space
C. Slows down training significantly
D. Creates fixed augmented dataset
Explain what data augmentation is and why it is useful in training machine learning models.
Think about how changing images slightly can help a model learn better.
    Describe how you would implement data augmentation in a PyTorch image classification project.
    Focus on the code steps to add augmentation before feeding images to the model.

      Practice

      1. What is the main purpose of data augmentation in PyTorch training pipelines?
      easy
      A. To reduce the size of the training dataset
      B. To create new training data by modifying existing data
      C. To speed up model training by skipping data preprocessing
      D. To convert data into a different file format

      Solution

      1. Step 1: Understand data augmentation concept

        Data augmentation means making new training examples by changing existing ones, like flipping or rotating images.
      2. Step 2: Identify the purpose in training

        This helps the model see more variety and avoid memorizing only the original data, improving learning.
      3. Final Answer:

        To create new training data by modifying existing data -> Option B
      4. Quick Check:

        Data augmentation = create new data [OK]
      Hint: Data augmentation means changing data to get more examples [OK]
      Common Mistakes:
      • Thinking it reduces dataset size
      • Confusing augmentation with speeding training
      • Believing it changes file formats
      2. Which of the following is the correct way to apply a random horizontal flip to an image tensor using torchvision transforms?
      easy
      A. transforms.RandomHorizontalFlip(p=0.5)
      B. transforms.HorizontalFlip(prob=0.5)
      C. transforms.RandomFlip(direction='horizontal')
      D. transforms.FlipHorizontal(0.5)

      Solution

      1. Step 1: Recall torchvision transform syntax

        The correct transform for horizontal flip is RandomHorizontalFlip with a probability parameter p.
      2. Step 2: Match correct syntax

        Option A, transforms.RandomHorizontalFlip(p=0.5), uses the real class name and its keyword p. The names in the other options (HorizontalFlip, RandomFlip, FlipHorizontal) are not torchvision transforms.
      3. Final Answer:

        transforms.RandomHorizontalFlip(p=0.5) -> Option A
      4. Quick Check:

        Correct transform name and parameter = A [OK]
      Hint: Look for 'RandomHorizontalFlip' with p= probability [OK]
      Common Mistakes:
      • Using wrong transform names
      • Using 'prob' instead of 'p'
      • Incorrect parameter names or missing parentheses
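      3. What will be the output shape of the image tensor after applying the following transform, assuming sample.jpg is an RGB image?
      from PIL import Image
      from torchvision import transforms

      transform = transforms.Compose([
          transforms.RandomRotation(30),  # random angle in [-30, 30] degrees
          transforms.ToTensor()           # PIL image -> float tensor [C, H, W]
      ])

      image = Image.open('sample.jpg')
      tensor_image = transform(image)
      print(tensor_image.shape)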
      medium
      A. [3, H, W] where H and W are original image height and width
      B. [H, W, 3] where H and W are original image height and width
      C. [1, H, W] grayscale image shape
      D. [3, 30, 30] fixed size after rotation

      Solution

      1. Step 1: Understand transforms.Compose and RandomRotation

        RandomRotation rotates the image and, with its default expand=False, keeps the original size (height and width). ToTensor converts the image to a tensor with shape [channels, height, width].
      2. Step 2: Determine output tensor shape

        Assuming the image is an RGB color image (3 channels), the tensor shape will be [3, H, W], where H and W are the original height and width.
      3. Final Answer:

        [3, H, W] where H and W are original image height and width -> Option A
      4. Quick Check:

        Rotation keeps size, ToTensor outputs [3, H, W] [OK]
      Hint: ToTensor outputs [channels, height, width] shape [OK]
      Common Mistakes:
      • Confusing channel order as last dimension
      • Assuming rotation changes image size
      • Thinking output is grayscale shape
      4. Identify the error in this PyTorch data augmentation code snippet:
      transform = transforms.Compose([
          transforms.RandomHorizontalFlip(prob=0.5),
          transforms.RandomRotation(degrees=45),
          transforms.ToTensor()
      ])
      medium
      A. RandomRotation degrees must be a tuple, not a single number
      B. ToTensor should come before RandomRotation
      C. RandomHorizontalFlip should use keyword argument p=0.5
      D. Compose cannot combine multiple transforms

      Solution

      1. Step 1: Check RandomHorizontalFlip usage

        RandomHorizontalFlip names its probability parameter p, so the call must be p=0.5. Passing prob=0.5 raises a TypeError because prob is not an accepted keyword.
      2. Step 2: Verify other transforms

        RandomRotation accepts a single number for degrees, ToTensor can come last, and Compose supports multiple transforms.
      3. Final Answer:

        RandomHorizontalFlip should use keyword argument p=0.5 -> Option C
      4. Quick Check:

        Correct argument name = p [OK]
      Hint: Check argument names carefully in transform constructors [OK]
      Common Mistakes:
      • Using the wrong keyword name (prob instead of p)
      • Thinking degrees must be tuple
      • Misordering transforms in Compose
      5. You want to augment a dataset of images to improve model robustness. Which combination of transforms would best increase variety without changing image size or color channels?
      hard
      A. Resize(128) + RandomCrop(64) + RandomHorizontalFlip(p=0.5): resizes and crops to a smaller size (changes image size)
      B. RandomResizedCrop(size=224) + Grayscale(num_output_channels=1): crops, resizes, and converts to grayscale (changes size and channels)
      C. RandomVerticalFlip(p=1.0) + RandomRotation(90) + ToTensor(): always flips vertically and rotates by up to 90 degrees (may change orientation drastically)
      D. RandomHorizontalFlip(p=0.5) + RandomRotation(15) + ColorJitter(brightness=0.2): flips, rotates slightly, and varies brightness

      Solution

      1. Step 1: Analyze each option's effect on size and channels

        Option D (horizontal flip, small rotation, brightness jitter) flips, rotates slightly, and changes brightness without resizing or changing channels. Option B (RandomResizedCrop and grayscale conversion) changes both size and channels. Option C (vertical flip and RandomRotation(90)) always flips vertically and rotates by a random angle of up to 90 degrees, which can change orientation drastically. Option A (resize and crop) changes the image size.
      2. Step 2: Choose the option that keeps size and channels but increases variety

        RandomHorizontalFlip, small RandomRotation, and ColorJitter to vary brightness best fits the requirement by augmenting with flips, small rotations, and brightness changes without altering size or channels.
      3. Final Answer:

        RandomHorizontalFlip, small RandomRotation, and ColorJitter to vary brightness -> Option D
      4. Quick Check:

        Keep size and channels, add mild augmentations = D [OK]
      Hint: Pick augmentations that don't resize or change color channels [OK]
      Common Mistakes:
      • Choosing transforms that resize images
      • Converting images to grayscale unintentionally
      • Using large rotations that distort orientation