Recall & Review
beginner
What is data augmentation in a machine learning pipeline?
Data augmentation is a technique that creates new training examples by applying small, label-preserving changes to existing data. It helps the model learn better by showing it more varied examples.
beginner
Why do we use data augmentation in the training pipeline?
We use data augmentation to increase the size and diversity of the training data. This reduces overfitting and helps the model generalize better to new data.
beginner
Name three common data augmentation techniques for images.
Common techniques include flipping images horizontally, rotating images by small angles, and zooming in or out slightly.
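The three techniques named above can be sketched with Keras preprocessing layers. This is a minimal illustration, not a recommended configuration: the batch of random images and the factor values 0.05 and 0.1 are assumptions chosen for the example.

```python
import tensorflow as tf

# Hypothetical batch of 4 RGB images, 64x64, with values in [0, 1].
images = tf.random.uniform((4, 64, 64, 3))

# Each layer applies a random transform when called with training=True.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # mirror left-right at random
    tf.keras.layers.RandomRotation(0.05),      # rotate by up to 5% of a full turn (about 18 degrees)
    tf.keras.layers.RandomZoom(0.1),           # zoom in or out by up to 10%
])

augmented = augment(images, training=True)
print(augmented.shape)  # same shape as the input batch
```

These layers only randomize their output in training mode; at inference time they pass images through unchanged.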
intermediate
How can data augmentation be integrated into a TensorFlow pipeline?
In TensorFlow, data augmentation can be added to a tf.data pipeline by passing an augmentation function to Dataset.map(), which applies it to each element on the fly during training.
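A minimal sketch of such a pipeline is shown here. The random stand-in images, the labels, the function name augment, and the brightness range are assumptions made for illustration; a real pipeline would load its own data.

```python
import tensorflow as tf

# Hypothetical stand-in data: 8 RGB images (32x32) with binary labels.
images = tf.random.uniform((8, 32, 32, 3))
labels = tf.range(8) % 2

def augment(image, label):
    """Apply random changes to one image; the label is left unchanged."""
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.clip_by_value(image, 0.0, 1.0)  # keep pixel values in [0, 1]
    return image, label

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .shuffle(8)
    .map(augment, num_parallel_calls=tf.data.AUTOTUNE)  # augmentation happens here
    .batch(4)
    .prefetch(tf.data.AUTOTUNE)
)

for batch_images, batch_labels in train_ds:
    print(batch_images.shape, batch_labels.shape)
```

Placing map() before batch() means each image gets its own random transform rather than one transform shared by the whole batch.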
intermediate
What is the benefit of applying data augmentation on the fly during training instead of beforehand?
Applying augmentation on the fly saves storage space and creates new variations each epoch, making the training data more diverse without needing to save all augmented images.
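The idea can be illustrated with a small framework-agnostic sketch in NumPy. The dataset, the helper names augment and epoch_images, and the noise level are all hypothetical. Only the original images are stored, and each pass over them produces a different set of variations.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly altered copy; the stored original is left untouched."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                         # horizontal flip
    out = out + rng.normal(0.0, 0.02, out.shape)   # light pixel noise
    return np.clip(out, 0.0, 1.0)

def epoch_images(images, rng):
    """Yield freshly augmented images; nothing is written to disk."""
    for image in images:
        yield augment(image, rng)

rng = np.random.default_rng(0)
stored = rng.random((3, 8, 8))  # hypothetical dataset: 3 grayscale 8x8 images

epoch_1 = np.stack(list(epoch_images(stored, rng)))
epoch_2 = np.stack(list(epoch_images(stored, rng)))

print(epoch_1.shape == stored.shape)     # True
print(np.array_equal(epoch_1, epoch_2))  # False
```

Augmenting beforehand would instead require saving every generated variant, and the model would see the same fixed set in every epoch.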
Which of the following is NOT a typical image data augmentation technique?
Sorting pixels by brightness is not a data augmentation technique; it destroys the image's spatial structure, so the result no longer resembles a realistic training example.
In TensorFlow, which method is commonly used to apply data augmentation in a pipeline?
tf.data.Dataset.map() applies a function to each element in the dataset, making it ideal for applying augmentation on the fly.
What is a key advantage of using data augmentation during training?
Data augmentation increases the diversity of training data, helping the model generalize better.
Which statement about data augmentation is TRUE?
Data augmentation helps prevent overfitting by providing more varied training examples.
When applying data augmentation on the fly, what happens each training epoch?
On-the-fly augmentation applies new random changes each epoch, increasing data variety.
Explain how data augmentation improves model training and how it can be implemented in a TensorFlow pipeline.
Think about why showing the model more varied images helps it learn better.
Describe the difference between applying data augmentation before training and applying it on the fly during training.
Consider storage needs and data variety over multiple training passes.