
Image augmentation transforms in Computer Vision - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of image augmentation in machine learning?
Image augmentation creates new, varied images from existing ones to help models learn better and avoid overfitting by seeing more diverse examples.
beginner
Name three common image augmentation transforms.
Common transforms include rotation (turning the image), flipping (mirroring), and scaling (resizing).
intermediate
How does random rotation help a model learn?
Random rotation shows the model the same object from different angles, helping it recognize objects regardless of orientation.
beginner
What is the difference between horizontal flip and vertical flip?
Horizontal flip mirrors the image left to right, while vertical flip mirrors it top to bottom.
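To make the two directions concrete, here is a minimal sketch that treats an image as a nested list of pixel rows. The helper names `horizontal_flip` and `vertical_flip` are illustrative only and do not come from any library.

```python
# A tiny 2x3 "image": each inner list is one row of pixel values.
image = [
    [1, 2, 3],
    [4, 5, 6],
]

def horizontal_flip(img):
    """Mirror left to right: reverse the pixels within each row."""
    return [row[::-1] for row in img]

def vertical_flip(img):
    """Mirror top to bottom: reverse the order of the rows."""
    return [row[:] for row in img[::-1]]

print(horizontal_flip(image))  # [[3, 2, 1], [6, 5, 4]]
print(vertical_flip(image))    # [[4, 5, 6], [1, 2, 3]]
```

Applying the same flip twice returns the original image, which is a quick sanity check for either helper.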
intermediate
Why should augmentation transforms be applied carefully?
Some transforms can change the meaning of an image (for example, flipping text), so augmentations must keep the image realistic and its label valid for the task.
Which of these is NOT a typical image augmentation transform?
A. Sorting pixels
B. Flipping
C. Adding noise
D. Rotation
What does scaling an image do?
A. Changes the image color
B. Changes the image size
C. Flips the image horizontally
D. Rotates the image
Why use random brightness adjustment in augmentation?
A. To make the image blurry
B. To crop the image
C. To simulate different lighting conditions
D. To flip the image
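As a rough illustration of random brightness adjustment, the sketch below scales 8-bit pixel values by a randomly chosen factor. The function names and the `max_delta` parameter are assumptions made for this example; real libraries may define brightness jitter differently.

```python
import random

def adjust_brightness(img, factor):
    """Scale every pixel by `factor`, clamping to the 8-bit range 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def random_brightness(img, max_delta=0.2, rng=random):
    """Pick a factor uniformly from [1 - max_delta, 1 + max_delta] and apply it."""
    factor = rng.uniform(1 - max_delta, 1 + max_delta)
    return adjust_brightness(img, factor)

image = [[100, 200], [0, 250]]
print(adjust_brightness(image, 1.5))  # [[150, 255], [0, 255]]  (brighter, clipped at 255)
print(adjust_brightness(image, 0.5))  # [[50, 100], [0, 125]]   (darker)
print(random_brightness(image))       # varies from run to run
```

Each training epoch would then see the same scene as if it were lit slightly differently.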
Which transform would help a model recognize objects regardless of orientation?
A. Cropping
B. Noise addition
C. Color inversion
D. Rotation
What is a risk of applying vertical flip to images with text?
A. Text becomes unreadable or meaningless
B. Image size changes
C. Colors invert
D. Model accuracy improves
Explain why image augmentation is important and list at least three common transforms.
Think about how augmentation helps models see more varied data.
Describe how random rotation and flipping help improve model robustness.
Consider how these transforms simulate real-world variations.

      Practice

      1. What is the main purpose of image augmentation in training machine learning models?
      easy
      A. To reduce the size of the training dataset
      B. To remove noise from images
      C. To create more varied training images by modifying originals
      D. To convert images to grayscale only

      Solution

      1. Step 1: Understand image augmentation

        Image augmentation means making small changes to original images to create new ones.
      2. Step 2: Purpose in training

        This helps models see more variety and learn better, avoiding overfitting.
      3. Final Answer:

        To create more varied training images by modifying originals -> Option C
      4. Quick Check:

        Image augmentation = create varied images [OK]
      Hint: Augmentation means changing images to get more training data [OK]
      Common Mistakes:
      • Thinking augmentation reduces dataset size
      • Confusing augmentation with noise removal
      • Assuming augmentation only changes color
      2. Which of the following is the correct way to apply a horizontal flip using PyTorch's torchvision transforms?
      easy
      A. transforms.RandomHorizontalFlip(p=1.0)
      B. transforms.HorizontalFlip()
      C. transforms.FlipHorizontal()
      D. transforms.RandomFlip(direction='horizontal')

      Solution

      1. Step 1: Recall torchvision syntax

        torchvision provides transforms.RandomHorizontalFlip(p=probability), which flips the image horizontally with probability p. Setting p=1.0 flips every image.
      2. Step 2: Check options

        Only transforms.RandomHorizontalFlip(p=1.0) matches the correct function and parameter style.
      3. Final Answer:

        transforms.RandomHorizontalFlip(p=1.0) -> Option A
      4. Quick Check:

        Correct PyTorch flip = RandomHorizontalFlip [OK]
      Hint: Look for 'RandomHorizontalFlip' with probability parameter [OK]
      Common Mistakes:
      • Using non-existent transform names
      • Omitting p, which defaults to 0.5, so only about half of the images are flipped
      • Confusing horizontal with vertical flip
      3. Given the following code snippet using torchvision transforms, what is the output image size after applying the transforms?
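      from PIL import Image
      from torchvision import transforms

      transform = transforms.Compose([
          transforms.Resize((128, 128)),  # resize to exactly 128x128
          transforms.RandomCrop(100),     # take a random 100x100 patch
          transforms.ToTensor()           # PIL image -> tensor [C, H, W]
      ])
      
      image = Image.open('sample.jpg')
      output = transform(image)
      print(output.shape)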
      medium
      A. [3, 128, 128]
      B. [3, 100, 100]
      C. [1, 100, 100]
      D. [3, 228, 228]

      Solution

      1. Step 1: Analyze each transform step

        First, the image is resized to 128x128 pixels. Assuming sample.jpg is an RGB image, it has 3 color channels. Then a random crop of size 100x100 is taken.
      2. Step 2: Determine output tensor shape

        After cropping, the image size is 100x100 with 3 channels. ToTensor() converts it to a tensor with shape [channels, height, width] = [3, 100, 100].
      3. Final Answer:

        [3, 100, 100] -> Option B
      4. Quick Check:

        Resize then crop = final size 100x100 [OK]
      Hint: Resize then crop means output size = crop size [OK]
      Common Mistakes:
      • Ignoring the crop step size
      • Confusing channel dimension with batch size
      • Assuming crop keeps original size
      4. The following code is intended to rotate an image by 45 degrees using torchvision transforms, but it raises an error. What is the mistake?
      from PIL import Image
      from torchvision import transforms

      transform = transforms.Compose([
          transforms.Rotate(45),
          transforms.ToTensor()
      ])
      
      image = Image.open('sample.jpg')
      output = transform(image)
      medium
      A. transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation
      B. The angle 45 must be in radians, not degrees
      C. ToTensor must come before Rotate
      D. Image.open returns a tensor, so transform fails

      Solution

      1. Step 1: Check torchvision transform names

        There is no transforms.Rotate class. Rotation is done with transforms.RandomRotation or with the functional API (transforms.functional.rotate).
      2. Step 2: Identify correct usage

        To rotate by a fixed angle, use transforms.RandomRotation([45, 45]) or transforms.functional.rotate. The code as is will cause an AttributeError.
      3. Final Answer:

        transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation -> Option A
      4. Quick Check:

        No transforms.Rotate in torchvision [OK]
      Hint: Check transform names carefully; Rotate is not a direct class [OK]
      Common Mistakes:
      • Using non-existent transform classes
      • Confusing degrees and radians
      • Wrong order of transforms
      5. You want to augment a dataset of images to improve model robustness. Which combination of transforms would best simulate real-world variations while keeping image size constant?
      hard
      A. transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128)
      B. transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip() only
      C. transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor()
      D. transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2)

      Solution

      1. Step 1: Understand augmentation goals

        We want to simulate real-world changes like size, flip, and color while keeping output size fixed.
      2. Step 2: Evaluate options

        transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2) crops a random region and resizes it to 224x224, flips horizontally with probability 0.5, and varies brightness and contrast. These are all common augmentations, and the output is always 224x224.
      3. Step 3: Check other options

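        transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip() also produces 224x224 output, but its only random step is a vertical flip, which is rarely realistic, and it has no color changes. transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor() produces a fixed 200x200 output, but rotations of up to ±90 degrees are unrealistic for most upright scenes and it adds no flip or color variation. transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128) allows rotations of up to ±180 degrees and then shrinks the 224x224 crop to 128x128, discarding detail; RandomCrop(224) also fails on images smaller than 224 pixels on a side.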
      4. Final Answer:

        transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2) -> Option D
      5. Quick Check:

        Best augmentations keep size fixed and add variety [OK]
      Hint: Pick transforms that keep size fixed and add flip + color changes [OK]
      Common Mistakes:
      • Choosing transforms that change image size unpredictably
      • Ignoring color augmentations
      • Using only vertical flips which are less common