Practice

1. What is the main purpose of image augmentation in training machine learning models?

easy

A. To reduce the size of the training dataset

B. To remove noise from images

C. To create more varied training images by modifying originals

D. To convert images to grayscale only

Solution

Step 1: Understand image augmentation
Image augmentation means making small changes to original images to create new ones.
Step 2: Purpose in training
Training on these variations exposes the model to more variety, which helps it generalize and reduces overfitting.
Final Answer:
To create more varied training images by modifying originals -> Option C
Quick Check:
Image augmentation = create varied images [OK]

Hint: Augmentation means changing images to get more training data

Common Mistakes:

Thinking augmentation reduces dataset size
Confusing augmentation with noise removal
Assuming augmentation only changes color

2. Which of the following is the correct way to apply a horizontal flip using PyTorch's torchvision transforms?

easy

A. transforms.RandomHorizontalFlip(p=1.0)

B. transforms.HorizontalFlip()

C. transforms.FlipHorizontal()

D. transforms.RandomFlip(direction='horizontal')

Solution

Step 1: Recall torchvision syntax
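torchvision provides transforms.RandomHorizontalFlip(p=probability), which flips an image horizontally with probability p (default 0.5). Setting p=1.0 makes the flip happen every time.
Step 2: Check options
Only transforms.RandomHorizontalFlip(p=1.0) names a transform that exists in torchvision.transforms; HorizontalFlip, FlipHorizontal, and RandomFlip are not part of it.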
Final Answer:
transforms.RandomHorizontalFlip(p=1.0) -> Option A
Quick Check:
Correct PyTorch flip = RandomHorizontalFlip [OK]

Hint: Look for 'RandomHorizontalFlip' with probability parameter

Common Mistakes:

Using non-existent transform names
Leaving out p, so the default p=0.5 flips only about half the images
Confusing horizontal with vertical flip

3. Given the following code snippet using torchvision transforms, what is the output image size after applying the transforms?

from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.RandomCrop(100),
    transforms.ToTensor()
])

image = Image.open('sample.jpg')
output = transform(image)
print(output.shape)

medium

A. [3, 128, 128]

B. [3, 100, 100]

C. [1, 100, 100]

D. [3, 228, 228]

Solution

Step 1: Analyze each transform step
First, the image is resized to 128x128 pixels. Assuming sample.jpg is an RGB image, it has 3 color channels. Then a random 100x100 crop is taken.
Step 2: Determine output tensor shape
After cropping, the image is 100x100 with 3 channels. ToTensor() converts it to a tensor with shape [channels, height, width], so print(output.shape) shows torch.Size([3, 100, 100]).
Final Answer:
[3, 100, 100] -> Option B
Quick Check:
Resize then crop = final size 100x100 [OK]

Hint: Resize then crop means output size = crop size

Common Mistakes:

Ignoring the crop step size
Confusing channel dimension with batch size
Assuming crop keeps original size

4. The following code is intended to rotate an image by 45 degrees using torchvision transforms, but it raises an error. What is the mistake?

from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Rotate(45),
    transforms.ToTensor()
])

image = Image.open('sample.jpg')
output = transform(image)

medium

A. transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation

B. The angle 45 must be in radians, not degrees

C. ToTensor must come before Rotate

D. Image.open returns a tensor, so transform fails

Solution

Step 1: Check torchvision transform names
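There is no transforms.Rotate class. Rotation is done with transforms.RandomRotation or with the functional API, transforms.functional.rotate.
Step 2: Identify correct usage
To rotate by a fixed angle, use transforms.RandomRotation([45, 45]) or transforms.functional.rotate(image, 45). Note that transforms.RandomRotation(45) samples an angle between -45 and 45 degrees rather than always rotating by 45. The code as written raises an AttributeError.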
Final Answer:
transforms.Rotate doesn't exist; should use transforms.functional.rotate or transforms.RandomRotation -> Option A
Quick Check:
No transforms.Rotate in torchvision [OK]

Hint: Check transform names carefully; Rotate is not a direct class

Common Mistakes:

Using non-existent transform classes
Confusing degrees and radians
Wrong order of transforms

5. You want to augment a dataset of images to improve model robustness. Which combination of transforms would best simulate real-world variations while keeping image size constant?

hard

A. transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128)

B. transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip() only

C. transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor()

D. transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2)

Solution

Step 1: Understand augmentation goals
We want to simulate real-world changes like size, flip, and color while keeping output size fixed.
Step 2: Evaluate options
In Option D, transforms.RandomResizedCrop(224) crops a region of random size and aspect ratio and resizes it to 224x224, transforms.RandomHorizontalFlip() flips horizontally with probability 0.5, and transforms.ColorJitter(brightness=0.2, contrast=0.2) randomly varies brightness and contrast. These are all common augmentations, and the output size stays fixed at 224x224.
Step 3: Check other options
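Option B (transforms.Resize(256), transforms.CenterCrop(224), transforms.RandomVerticalFlip()) keeps the output at 224x224, but its resize and crop are deterministic, so the only variation is a vertical flip, which is rarely realistic for natural images, and it adds no color changes. Option C (transforms.RandomRotation(90), transforms.RandomCrop(200), transforms.ToTensor()) rotates by up to 90 degrees in either direction, which leaves empty corners, and RandomCrop(200) raises an error on images smaller than 200 pixels on a side; it also has no flip or color changes. Option A (transforms.RandomCrop(224), transforms.RandomRotation(180), transforms.Resize(128)) allows rotations of up to 180 degrees, which can turn images upside down, and then shrinks the 224x224 crop to 128x128, discarding detail.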
Final Answer:
transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ColorJitter(brightness=0.2, contrast=0.2) -> Option D
Quick Check:
Best augmentations keep size fixed and add variety [OK]

Hint: Pick transforms that keep size fixed and add flip + color changes

Common Mistakes:

Choosing transforms that change image size unpredictably
Ignoring color augmentations
Using only vertical flips which are less common

Image augmentation transforms in Computer Vision - Model Metrics & Evaluation
