
Data augmentation in pipeline in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Data augmentation in pipeline
Problem: You want to improve your image classification model by making it more robust to variations in input images.
Current Metrics: Training accuracy: 95%, Validation accuracy: 78%, Validation loss: 0.85
Issue: The model overfits the training data and performs poorly on validation data due to lack of input variety.
Your Task
Add data augmentation to the training pipeline to reduce overfitting and improve validation accuracy to above 85%.
Do not change the model architecture.
Keep the number of training epochs the same.
Use TensorFlow's data augmentation layers in the input pipeline.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load example dataset
(train_images, train_labels), (val_images, val_labels) = tf.keras.datasets.cifar10.load_data()

# Normalize images
train_images = train_images.astype('float32') / 255.0
val_images = val_images.astype('float32') / 255.0

# Define data augmentation pipeline
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Build model
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,  # Apply augmentation here
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(train_images, train_labels, epochs=10, batch_size=64, validation_data=(val_images, val_labels))
Added a data augmentation layer sequence with RandomFlip, RandomRotation, and RandomZoom.
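Inserted the augmentation layers at the start of the model so training images are augmented on the fly; these layers are active only during training and pass images through unchanged during evaluation and inference.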
Kept model architecture and training parameters unchanged.
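The solution above places the augmentation layers inside the model. An alternative that follows the task's "input pipeline" wording more literally is to apply the same layers in a tf.data pipeline with Dataset.map. The following is a minimal sketch under assumptions: the helper name make_train_dataset, the shuffle buffer of 1000, and the small random tensors standing in for CIFAR-10 are illustrative and not part of the original solution.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Same augmentation layers as in the solution
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])


def make_train_dataset(images, labels, batch_size=64):
    """Hypothetical helper: shuffled, batched, augmented tf.data pipeline."""
    ds = tf.data.Dataset.from_tensor_slices((images, labels))
    ds = ds.shuffle(1000).batch(batch_size)
    # training=True makes the random transforms run inside the pipeline
    ds = ds.map(
        lambda x, y: (data_augmentation(x, training=True), y),
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    return ds.prefetch(tf.data.AUTOTUNE)


# Stand-in data with CIFAR-10's image shape (assumption, for a quick check)
demo_images = tf.random.uniform([8, 32, 32, 3])
demo_labels = tf.zeros([8], dtype=tf.int32)

train_ds = make_train_dataset(demo_images, demo_labels, batch_size=4)
batch_images, batch_labels = next(iter(train_ds))
print(batch_images.shape, batch_labels.shape)
```

With this variant, the model would be built without the data_augmentation layer and trained on train_ds, while validation data is passed through without augmentation.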
Results Interpretation

Before augmentation: Training accuracy was 95%, validation accuracy was 78%, showing overfitting.

After augmentation: Training accuracy decreased to 90%, validation accuracy improved to 87%, and validation loss decreased, indicating better generalization.

Data augmentation increases input variety, helping the model learn more robust features and reducing overfitting.
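One way to read these numbers is through the gap between training and validation accuracy. The short sketch below works through that arithmetic using the figures reported above; the helper name generalization_gap is hypothetical and only used for illustration.

```python
def generalization_gap(train_acc, val_acc):
    """Hypothetical helper: training accuracy minus validation accuracy."""
    return train_acc - val_acc


# Figures reported in the results above
gap_before = generalization_gap(0.95, 0.78)
gap_after = generalization_gap(0.90, 0.87)

print(f"gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
# gap before: 0.17, gap after: 0.03
```

A smaller gap suggests the model relies less on memorized training examples, which is consistent with the interpretation above.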
Bonus Experiment
Try adding more augmentation types like RandomContrast and RandomTranslation to see if validation accuracy improves further.
💡 Hint
Add these layers to the data_augmentation Sequential model and observe changes in validation metrics.
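As a starting point for the bonus experiment, the sketch below extends the augmentation sequence with RandomContrast and RandomTranslation. The factor values (0.2 for contrast, 0.1 for translation) and the name extended_augmentation are illustrative assumptions, not tuned recommendations.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Original three layers plus the two suggested in the bonus experiment
extended_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.2),          # illustrative contrast factor
    layers.RandomTranslation(0.1, 0.1),  # shift up to 10% in height and width
])

# Quick shape check on a random stand-in batch with CIFAR-10's image shape
batch = tf.random.uniform([4, 32, 32, 3])
augmented = extended_augmentation(batch, training=True)
print(augmented.shape)
```

Swapping extended_augmentation in for data_augmentation in the model leaves the rest of the solution unchanged, so the validation metrics can be compared directly.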

Practice

1. What is the main purpose of data augmentation in a TensorFlow training pipeline?
easy
A. To speed up the training process by skipping some images
B. To reduce the size of the training dataset
C. To create more varied training data by randomly changing original images
D. To convert images into grayscale only

Solution

  1. Step 1: Understand data augmentation concept

    Data augmentation creates new training images by applying random changes like flips or rotations to original images.
  2. Step 2: Identify the purpose in training pipeline

    This helps the model see more varied examples, improving learning and reducing overfitting.
  3. Final Answer:

    To create more varied training data by randomly changing original images -> Option C
  4. Quick Check:

    Data augmentation = varied training data [OK]
Hint: Augmentation adds variety to training images [OK]
Common Mistakes:
  • Thinking augmentation reduces dataset size
  • Believing augmentation speeds training by skipping data
  • Assuming augmentation only converts images to grayscale
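To make the idea concrete, the sketch below calls a small augmentation sequence on the same batch with training=True and with training=False. It assumes the usual Keras behavior that random preprocessing layers only transform inputs in training mode; the variable names are illustrative.

```python
import tensorflow as tf
from tensorflow.keras import layers

augment = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
])

image_batch = tf.random.uniform([2, 32, 32, 3])

# training=True: random transforms are applied, so repeated calls usually differ
augmented_a = augment(image_batch, training=True)
augmented_b = augment(image_batch, training=True)

# training=False: the layers pass the images through without changes
passthrough = augment(image_batch, training=False)
unchanged = bool(tf.reduce_all(tf.equal(passthrough, image_batch)))

print(augmented_a.shape, augmented_b.shape, unchanged)
```

The dataset itself never grows or shrinks; the model simply sees a differently transformed version of each image every time it is drawn during training.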
2. Which of the following correctly adds a random horizontal flip augmentation layer in a TensorFlow Sequential pipeline?
easy
A. tf.keras.Sequential([tf.keras.layers.RandomFlip('horizontal')])
B. tf.keras.Sequential([tf.keras.layers.FlipRandom('horizontal')])
C. tf.keras.Sequential([tf.keras.layers.RandomFlip(mode='vertical')])
D. tf.keras.Sequential([tf.keras.layers.RandomFlip('diagonal')])

Solution

  1. Step 1: Recall TensorFlow augmentation syntax

    The layer is RandomFlip. Its mode parameter, passed positionally or as the keyword mode, accepts the strings 'horizontal', 'vertical', or 'horizontal_and_vertical' (the default).
  2. Step 2: Check each option

    Option A uses the correct class and the 'horizontal' mode. Option B uses a class name, FlipRandom, that does not exist. Option C is valid syntax, but it flips images vertically rather than horizontally. Option D uses the unsupported flip mode 'diagonal'.
  3. Final Answer:

    tf.keras.Sequential([tf.keras.layers.RandomFlip('horizontal')]) -> Option A
  4. Quick Check:

    Correct layer and flip mode = tf.keras.Sequential([tf.keras.layers.RandomFlip('horizontal')]) [OK]
Hint: Use RandomFlip('horizontal') exactly as named [OK]
Common Mistakes:
  • Using wrong layer class name
  • Choosing the wrong flip direction
  • Using unsupported flip modes
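To see what a horizontal flip does to pixel data, the sketch below uses a tiny 1x3 image and compares the deterministic tf.image.flip_left_right with the random layer. The toy image and variable names are illustrative assumptions.

```python
import tensorflow as tf

# One 1x3 single-channel image whose pixel values increase left to right
image = tf.reshape(tf.constant([1.0, 2.0, 3.0]), [1, 1, 3, 1])

# Deterministic horizontal flip: the width axis is reversed
flipped = tf.image.flip_left_right(image)
flipped_values = tf.reshape(flipped, [-1]).numpy().tolist()
print(flipped_values)  # [3.0, 2.0, 1.0]

# The random layer returns either the original or the flipped image
random_flip = tf.keras.layers.RandomFlip('horizontal')
output = random_flip(image, training=True)
is_original = bool(tf.reduce_all(tf.equal(output, image)))
is_flipped = bool(tf.reduce_all(tf.equal(output, flipped)))
print(is_original or is_flipped)
```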
3. Given the following TensorFlow code snippet, what will be the output shape of the augmented images?
import tensorflow as tf
aug = tf.keras.Sequential([
  tf.keras.layers.RandomFlip('horizontal'),
  tf.keras.layers.RandomRotation(0.1)
])
input_image = tf.random.uniform([1, 128, 128, 3])
output_image = aug(input_image)
print(output_image.shape)
medium
A. (1, 128, 128, 3)
B. (128, 128, 3)
C. (1, 256, 256, 3)
D. (1, 128, 128)

Solution

  1. Step 1: Understand input and augmentation layers

    Input shape is (1, 128, 128, 3) meaning batch size 1, 128x128 image with 3 color channels. RandomFlip and RandomRotation do not change image size.
  2. Step 2: Check output shape after augmentation

    Augmentation layers keep the shape same, so output shape remains (1, 128, 128, 3).
  3. Final Answer:

    (1, 128, 128, 3) -> Option A
  4. Quick Check:

    Augmentation keeps shape = (1, 128, 128, 3) [OK]
Hint: Augmentation layers keep input shape unchanged [OK]
Common Mistakes:
  • Assuming rotation changes image size
  • Ignoring batch dimension in output
  • Dropping color channels
4. Identify the error in this TensorFlow data augmentation pipeline code:
import tensorflow as tf
aug = tf.keras.Sequential([
  tf.keras.layers.RandomFlip('horizontal'),
  tf.keras.layers.RandomRotation(0.2, 0.3)
])
medium
A. Missing input shape in Sequential
B. RandomFlip does not accept 'horizontal' as argument
C. Sequential cannot contain augmentation layers
D. RandomRotation requires a single float or tuple, not two separate floats

Solution

  1. Step 1: Check RandomRotation layer arguments

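    RandomRotation expects its factor as either a single float or one tuple like (lower, upper). Passing two separate floats is invalid: the second positional value, 0.3, is taken as the next parameter (fill_mode) instead of an upper bound.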
  2. Step 2: Verify other parts

    RandomFlip('horizontal') is valid. Sequential can contain augmentation layers. Input shape is optional here.
  3. Final Answer:

    RandomRotation requires a single float or tuple, not two separate floats -> Option D
  4. Quick Check:

    RandomRotation argument format error = RandomRotation requires a single float or tuple, not two separate floats [OK]
Hint: RandomRotation needs one float or tuple, not two floats [OK]
Common Mistakes:
  • Passing multiple floats instead of tuple to RandomRotation
  • Thinking RandomFlip argument is invalid
  • Believing Sequential can't hold augmentation layers
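The two accepted forms of the factor argument can be sketched as follows. The specific bounds (-0.2, 0.3) and the variable names are illustrative; the final try block only reports what happens with the quiz's call and catches exceptions broadly, since the exact exception type is an assumption that may vary between versions.

```python
import tensorflow as tf
from tensorflow.keras import layers

# Single float: random rotation in [-0.2, 0.2] of a full turn
symmetric = layers.RandomRotation(0.2)

# Tuple: explicit lower and upper bounds, passed as ONE argument
ranged = layers.RandomRotation((-0.2, 0.3))

images = tf.random.uniform([2, 64, 64, 3])
symmetric_out = symmetric(images, training=True)
ranged_out = ranged(images, training=True)
print(symmetric_out.shape, ranged_out.shape)

# The quiz's call: 0.3 lands in the second positional parameter, not the factor
try:
    layers.RandomRotation(0.2, 0.3)
    print("constructed without error")
except Exception as err:
    print("rejected:", type(err).__name__)
```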
5. You want to build a TensorFlow data augmentation pipeline that randomly flips images horizontally, rotates them by up to 20% of a full turn, and zooms in or out by up to 10%. Which of the following code snippets correctly implements this pipeline?
hard
A. tf.keras.Sequential([ tf.keras.layers.RandomFlip('horizontal'), tf.keras.layers.RandomRotation(0.2), tf.keras.layers.RandomZoom((0.1, 0.2)) ])
B. tf.keras.Sequential([ tf.keras.layers.RandomFlip('horizontal'), tf.keras.layers.RandomRotation(0.2), tf.keras.layers.RandomZoom(0.1) ])
C. tf.keras.Sequential([ tf.keras.layers.RandomFlip('horizontal'), tf.keras.layers.RandomRotation(0.02), tf.keras.layers.RandomZoom(10) ])
D. tf.keras.Sequential([ tf.keras.layers.RandomFlip('vertical'), tf.keras.layers.RandomRotation(20), tf.keras.layers.RandomZoom((0.1, 0.1)) ])

Solution

  1. Step 1: Check flip and rotation parameters

    RandomFlip('horizontal') is correct. RandomRotation takes its factor as a fraction of a full turn (2π), so 0.2 means up to 20% of 360°, or 72°, in either direction. Option C's 0.02 is only 2%, and Option D's 20 is a degree-style value rather than a fraction; Option D also flips vertically.
  2. Step 2: Check zoom parameters

    RandomZoom(0.1) samples a zoom factor in the range [-10%, +10%], which covers both zooming in and zooming out. Option A's RandomZoom((0.1, 0.2)) only zooms out, by 10% to 20%, which was not requested. Option C's RandomZoom(10) is far larger than 10%.
  3. Final Answer:

    tf.keras.Sequential([ tf.keras.layers.RandomFlip('horizontal'), tf.keras.layers.RandomRotation(0.2), tf.keras.layers.RandomZoom(0.1) ]) -> Option B
  4. Quick Check:

    Correct flip, rotation fraction, and zoom float = tf.keras.Sequential([ tf.keras.layers.RandomFlip('horizontal'), tf.keras.layers.RandomRotation(0.2), tf.keras.layers.RandomZoom(0.1) ]) [OK]
Hint: Use fractions for rotation and single float for zoom [OK]
Common Mistakes:
  • Using degrees instead of fraction for rotation
  • Passing large numbers to zoom
  • Choosing wrong flip direction
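Because RandomRotation's factor is a fraction of a full turn rather than an angle in degrees, a small conversion helps when choosing values. The helper names below are hypothetical and only illustrate the arithmetic.

```python
def factor_to_degrees(factor):
    """Hypothetical helper: max rotation in degrees for a given factor."""
    return factor * 360.0


def degrees_to_factor(degrees):
    """Hypothetical helper: factor that corresponds to a max angle in degrees."""
    return degrees / 360.0


max_degrees = factor_to_degrees(0.2)        # the quiz's 20% of a full turn
factor_for_20_deg = degrees_to_factor(20)   # if 20 degrees were intended

print(f"{max_degrees:.1f} degrees, factor {factor_for_20_deg:.4f}")
# 72.0 degrees, factor 0.0556
```

So RandomRotation(0.2) allows rotations of up to 72° in either direction, while a limit of 20° would need a factor of about 0.056.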