Recall & Review

beginner

What is data augmentation in machine learning?

Data augmentation increases the size and diversity of training data by applying small, label-preserving changes to existing examples, such as flipping or rotating images. The added variety helps models generalize and reduces overfitting.


beginner

Name two common image data augmentation techniques.

Two common image data augmentation techniques are flipping (horizontal or vertical) and rotation by small angles. These create new images that help the model see different views of the same object.


intermediate

How does data augmentation help prevent overfitting?

Data augmentation adds variety to training data, so the model doesn't memorize exact examples. This makes the model generalize better to new data, reducing overfitting.


beginner

Show a simple PyTorch code snippet to apply random horizontal flip to images during training.

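from torchvision import transforms

# Transforms run in order: random flip first, then conversion to a tensor.
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),  # mirror left-right with probability 0.5
    transforms.ToTensor(),                   # PIL image -> float tensor scaled to [0, 1]
])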

This code flips each image horizontally with a 50% chance, then converts it to a tensor.


intermediate

What is the difference between online and offline data augmentation?

Offline augmentation creates new data files before training, increasing dataset size on disk. Online augmentation applies random changes on the fly during training, saving storage and adding variety each epoch.


Which of the following is NOT a typical data augmentation technique for images?

A. Random rotation

B. Horizontal flip

C. Changing image file format

D. Adding Gaussian noise

Why do we use data augmentation in training machine learning models?

A. To reduce training time

B. To increase dataset size and variety

C. To make the model smaller

D. To remove noisy data

In PyTorch, which module provides common data augmentation transforms?

A. torch.utils.data

B. torch.optim

C. torch.nn

D. torchvision.transforms

What does RandomHorizontalFlip(p=0.5) do during training?

A. Flips images horizontally with 50% chance

B. Flips every image horizontally

C. Flips images vertically with 50% chance

D. Does nothing

Which is a benefit of online data augmentation over offline augmentation?

A. Adds variety each training epoch

B. Requires more disk space

C. Slows down training significantly

D. Creates fixed augmented dataset

Explain what data augmentation is and why it is useful in training machine learning models.

Describe how you would implement data augmentation in a PyTorch image classification project.

Practice


1. What is the main purpose of data augmentation in PyTorch training pipelines?

easy

A. To reduce the size of the training dataset

B. To create new training data by modifying existing data

C. To speed up model training by skipping data preprocessing

D. To convert data into a different file format

5. You want to augment a dataset of images to improve model robustness. Which combination of transforms would best increase variety without changing image size or color channels?

Options:
A) RandomHorizontalFlip(p=0.5) + RandomRotation(15) + ColorJitter(brightness=0.2)
B) RandomResizedCrop(size=224) + Grayscale(num_output_channels=1)
C) RandomVerticalFlip(p=1.0) + RandomRotation(90) + ToTensor()
D) Resize(128) + RandomCrop(64) + RandomHorizontalFlip(p=0.5)

hard

A. RandomHorizontalFlip, small RandomRotation, and ColorJitter to vary brightness (keeps size and channels)

B. RandomResizedCrop and converting to grayscale (changes size and channels)

C. Vertical flip and 90-degree rotation (may change orientation drastically)

D. Resize and crop to smaller size (changes image size)

Data augmentation in PyTorch - Cheat Sheet & Quick Revision


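Solution

Step 1: Understand data augmentation concept

Data augmentation applies small changes, such as flips or rotations, to existing training samples.

Step 2: Identify the purpose in training

The modified samples act as new training data, which adds variety and helps reduce overfitting. Augmentation does not shrink the dataset, skip preprocessing, or change file formats.

Final Answer:

To create new training data by modifying existing data (option B).

Quick Check:

Augmentation = new training samples made from existing ones.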

Solution

Step 1: Recall torchvision transform syntax

Step 2: Match correct syntax

Final Answer:

Quick Check:

Solution

Step 1: Understand transforms.Compose and RandomRotation

Step 2: Determine output tensor shape

Final Answer:

Quick Check:

Solution

Step 1: Check RandomHorizontalFlip usage

Step 2: Verify other transforms

Final Answer:

Quick Check:

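Solution

Step 1: Analyze each option's effect on size and channels

RandomResizedCrop(size=224) changes the output size, and Grayscale(num_output_channels=1) reduces the image to one channel. Resize(128) followed by RandomCrop(64) yields 64x64 outputs. RandomVerticalFlip(p=1.0) flips every image, so it adds no randomness, and RandomRotation(90) allows large rotations that can change orientation drastically.

Step 2: Choose the option that keeps size and channels but increases variety

RandomHorizontalFlip(p=0.5), RandomRotation(15), and ColorJitter(brightness=0.2) keep the image size and channel count while varying orientation and brightness.

Final Answer:

RandomHorizontalFlip(p=0.5) + RandomRotation(15) + ColorJitter(brightness=0.2)

Quick Check:

Flip, small rotation, and brightness jitter leave size and channels unchanged.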