
Image augmentation transforms in Computer Vision - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to apply a horizontal flip to the image using torchvision transforms.

transform = torchvision.transforms.Compose([torchvision.transforms.[1]()])
Options:
A. RandomHorizontalFlip
B. RandomVerticalFlip
C. ColorJitter
D. RandomRotation
Common Mistakes
Using RandomVerticalFlip instead flips the image upside down.
Using ColorJitter changes colors, not flips.
RandomRotation rotates the image, not flips.
2. Fill in the blank
medium

Complete the code to apply a random rotation of up to 30 degrees to the image.

transform = torchvision.transforms.RandomRotation([1])
Options:
A. 15
B. 30
C. 45
D. 60
Common Mistakes
Choosing 15 limits the rotation to between -15 and +15 degrees, less than asked.
Choosing 45 or 60 allows rotations larger than the requested 30 degrees.
3. Fill in the blank
hard

Fix the error in the code to correctly apply color jitter with brightness change.

transform = torchvision.transforms.ColorJitter(brightness=[1])
Options:
A. '0.5'
B. 50
C. 0.5
D. True
Common Mistakes
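Passing the string '0.5' raises a TypeError; brightness must be a number or a (min, max) pair.
Using 50 is accepted but far too large: it allows brightness factors anywhere from 0 to 51.
Using True is not a meaningful brightness value; Python treats it as the number 1.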
4. Fill in the blank
hard

Fill both blanks to create a transform that resizes images to 128x128 and then converts them to tensors.

transform = torchvision.transforms.Compose([torchvision.transforms.Resize([1]), torchvision.transforms.[2]()])
Options:
A. (128, 128)
B. ToTensor
C. Normalize
D. CenterCrop
Common Mistakes
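Using Normalize instead of ToTensor does not convert the image to a tensor; Normalize itself expects a tensor as input.
Using CenterCrop crops the image rather than converting it, and it requires a size argument.
Passing a single int to Resize scales the shorter side to 128 and preserves the aspect ratio, so the output is not necessarily 128x128.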
5. Fill in the blank
hard

Fill all three blanks to create a transform pipeline that randomly crops 100x100 patches, applies horizontal flip, and normalizes with mean 0.5 and std 0.5.

transform = torchvision.transforms.Compose([
    torchvision.transforms.RandomCrop([1]),
    torchvision.transforms.[2](),
    torchvision.transforms.Normalize(mean=[[3]], std=[0.5])
])
Options:
A. (100, 100)
B. RandomHorizontalFlip
C. 0.5
D. RandomVerticalFlip
Common Mistakes
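Using RandomVerticalFlip flips vertically, not horizontally.
Using a mean other than 0.5 does not give the requested normalization.
Assuming a single int changes RandomCrop's behavior: RandomCrop(100) is shorthand for a square (100, 100) crop.
Forgetting that Normalize expects a tensor: with PIL input, a ToTensor step must come before it.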

Practice

1. What is the main purpose of image augmentation in training machine learning models?
easy
A. To reduce the size of the training dataset
B. To remove noise from images
C. To create more varied training images by modifying originals
D. To convert images to grayscale only

Solution

  1. Step 1: Understand image augmentation

    Image augmentation means making small changes to original images to create new ones.
  2. Step 2: Purpose in training

    This exposes the model to more variety during training, which helps it generalize and reduces overfitting.
  3. Final Answer:

    To create more varied training images by modifying originals -> Option C
  4. Quick Check:

    Image augmentation = create varied images [OK]
Hint: Augmentation means changing images to get more training data [OK]
Common Mistakes:
  • Thinking augmentation reduces dataset size
  • Confusing augmentation with noise removal
  • Assuming augmentation only changes color
2. Which of the following is the correct way to apply a horizontal flip using PyTorch's torchvision transforms?
easy
A. transforms.RandomHorizontalFlip(p=1.0)
B. transforms.HorizontalFlip()
C. transforms.FlipHorizontal()
D. transforms.RandomFlip(direction='horizontal')

Solution

  1. Step 1: Recall torchvision syntax

    PyTorch uses transforms.RandomHorizontalFlip(p=probability) to flip images horizontally.
  2. Step 2: Check options

    Only transforms.RandomHorizontalFlip(p=1.0) matches the correct function and parameter style.
  3. Final Answer:

    transforms.RandomHorizontalFlip(p=1.0) -> Option A
  4. Quick Check:

    Correct PyTorch flip = RandomHorizontalFlip [OK]
Hint: Look for 'RandomHorizontalFlip' with probability parameter [OK]
Common Mistakes:
  • Using non-existent transform names
  • Forgetting that p defaults to 0.5, so omitting it flips only about half the images
  • Confusing horizontal with vertical flip
3. Given the following code snippet using torchvision transforms, what is the output image size after applying the transforms?
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomCrop(100),
    transforms.ToTensor()
])

image = Image.open('sample.jpg')
output = transform(image)
print(output.shape)
medium
A. [3, 128, 128]
B. [3, 100, 100]
C. [1, 100, 100]
D. [3, 228, 228]

Solution

  1. Step 1: Analyze each transform step

    First, the image is resized to 128x128 pixels; assuming sample.jpg is an RGB image, it has 3 color channels. Then a random 100x100 crop is taken.
  2. Step 2: Determine output tensor shape

    After cropping, the image size is 100x100 with 3 channels. ToTensor() converts it to a tensor with shape [channels, height, width] = [3, 100, 100].
  3. Final Answer:

    [3, 100, 100] -> Option B
  4. Quick Check:

    Resize then crop = final size 100x100 [OK]
Hint: Resize then crop means output size = crop size [OK]
Common Mistakes:
  • Ignoring the crop step size
  • Confusing channel dimension with batch size
  • Assuming crop keeps original size
4. The following code is intended to rotate an image by 45 degrees using torchvision transforms, but it raises an error. What is the mistake?
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Rotate(45),
    transforms.ToTensor()
])

image = Image.open('sample.jpg')
output = transform(image)
medium
A. transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation
B. The angle 45 must be in radians, not degrees
C. ToTensor must come before Rotate
D. Image.open returns a tensor, so transform fails

Solution

  1. Step 1: Check torchvision transform names

    There is no transforms.Rotate class. Rotation is done with transforms.RandomRotation or using functional API.
  2. Step 2: Identify correct usage

    To rotate by a fixed angle, use transforms.RandomRotation([45, 45]) or transforms.functional.rotate. The code as is will cause an AttributeError.
  3. Final Answer:

    transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation -> Option A
  4. Quick Check:

    No transforms.Rotate in torchvision [OK]
Hint: Check transform names carefully; Rotate is not a direct class [OK]
Common Mistakes:
  • Using non-existent transform classes
  • Confusing degrees and radians
  • Wrong order of transforms
5. You want to augment a dataset of images to improve model robustness. Which combination of transforms would best simulate real-world variations while keeping image size constant?
hard
A. transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128)
B. transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip() only
C. transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor()
D. transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2)

Solution

  1. Step 1: Understand augmentation goals

    We want to simulate real-world changes like size, flip, and color while keeping output size fixed.
  2. Step 2: Evaluate options

    transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2) resizes and crops randomly to 224x224, flips horizontally, and changes brightness/contrast, all common augmentations that keep size constant.
  3. Step 3: Check other options

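    transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip() yields a fixed 224x224 output, but its only random step is a vertical flip and it adds no color variation. transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor() outputs 200x200 only when every input is at least 200 pixels on each side, and it has no flip or color change. transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128) allows rotations of up to 180 degrees, which rarely match real-world variation, and then shrinks the crop to 128x128, discarding detail.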
  4. Final Answer:

    transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2) -> Option D
  5. Quick Check:

    Best augmentations keep size fixed and add variety [OK]
Hint: Pick transforms that keep size fixed and add flip + color changes [OK]
Common Mistakes:
  • Choosing transforms that change image size unpredictably
  • Ignoring color augmentations
  • Using only vertical flips which are less common