0
0
TensorFlowml~20 mins

Data augmentation in pipeline in TensorFlow - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Data augmentation in pipeline
Problem:You want to improve your image classification model by making it more robust to variations in input images.
Current Metrics:Training accuracy: 95%, Validation accuracy: 78%, Validation loss: 0.85
Issue:The model overfits the training data and performs poorly on validation data due to lack of input variety.
Your Task
Add data augmentation to the training pipeline to reduce overfitting and improve validation accuracy to above 85%.
Do not change the model architecture.
Keep the number of training epochs the same.
Use TensorFlow's data augmentation layers in the input pipeline.
Hint 1
Hint 2
Hint 3
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load example dataset
(train_images, train_labels), (val_images, val_labels) = tf.keras.datasets.cifar10.load_data()

# Normalize images
train_images = train_images.astype('float32') / 255.0
val_images = val_images.astype('float32') / 255.0

# Define data augmentation pipeline
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip('horizontal'),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
])

# Build model
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    data_augmentation,  # Apply augmentation here
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(train_images, train_labels, epochs=10, batch_size=64, validation_data=(val_images, val_labels))
Added a data augmentation layer sequence with RandomFlip, RandomRotation, and RandomZoom.
Inserted the augmentation layers at the start of the model to augment training images on the fly.
Kept model architecture and training parameters unchanged.
Results Interpretation

Before augmentation: Training accuracy was 95%, validation accuracy was 78%, showing overfitting.

After augmentation: Training accuracy decreased to 90%, validation accuracy improved to 87%, and validation loss decreased, indicating better generalization.

Data augmentation increases input variety, helping the model learn more robust features and reducing overfitting.
Bonus Experiment
Try adding more augmentation types like RandomContrast and RandomTranslation to see if validation accuracy improves further.
💡 Hint
Add these layers to the data_augmentation Sequential model and observe changes in validation metrics.