Computer Vision · ML · ~20 mins

Augmentation policy search (AutoAugment) in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Augmentation policy search (AutoAugment)
Problem: You have a computer vision model trained on the CIFAR-10 dataset. It reaches 85% accuracy on the training data but only 70% on the validation data, a clear sign of overfitting and limited generalization.
Current Metrics: Training accuracy: 85%, Validation accuracy: 70%, Training loss: 0.45, Validation loss: 1.10
Issue: The model overfits the training data and generalizes poorly to the validation data. The current augmentation is basic (random flips and crops), which may not be enough to improve robustness.
Your Task
Improve validation accuracy to at least 78% by applying an AutoAugment policy search to find better augmentation strategies, while keeping training accuracy below 90% to reduce overfitting.
Do not change the model architecture.
Do not increase the training epochs beyond 50.
Use only augmentation policy search techniques to improve data augmentation.
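To make the idea of a policy search concrete: at its simplest it is a loop that samples candidate augmentation policies and keeps the one that scores best on validation data. The sketch below is purely illustrative, with a stubbed-out evaluation step (in a real search, `evaluate_policy` would train a small child model with the policy applied and return its validation accuracy); all names and the reduced op list are hypothetical, not the full AutoAugment search space.

```python
import random

# Candidate operations and magnitude buckets (illustrative subset, not the
# full AutoAugment space of ~16 ops x 10 magnitudes)
OPS = ['rotate', 'shear_x', 'translate_y', 'contrast', 'brightness']
MAGNITUDES = list(range(10))

def sample_policy(num_subpolicies=5):
    """A policy = list of (operation, probability, magnitude) sub-policies."""
    return [(random.choice(OPS), round(random.random(), 1), random.choice(MAGNITUDES))
            for _ in range(num_subpolicies)]

def evaluate_policy(policy):
    """Stub: a real search trains a child model with `policy` applied to its
    training data and returns that model's validation accuracy."""
    return random.random()  # placeholder score in [0, 1]

def random_search(num_trials=20, seed=0):
    """Keep the best-scoring policy over `num_trials` random samples."""
    random.seed(seed)
    best_policy, best_score = None, float('-inf')
    for _ in range(num_trials):
        policy = sample_policy()
        score = evaluate_policy(policy)
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy, best_score

best_policy, best_score = random_search()
print(best_policy, best_score)
```

The original AutoAugment paper uses reinforcement learning rather than random search to propose policies, but the outer loop — propose, evaluate on validation data, keep the best — is the same shape.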
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
import tensorflow_datasets as tfds

# Load CIFAR-10 dataset
(ds_train, ds_test), ds_info = tfds.load('cifar10', split=['train', 'test'], as_supervised=True, with_info=True)

# Normalize images to [0, 1]
def normalize_img(image, label):
    return tf.cast(image, tf.float32) / 255.0, label

# AutoAugment-style policy. Note: TensorFlow Addons does not ship a
# ready-made CIFAR-10 AutoAugment policy (the learned policies live in the
# TF Model Garden), so the sub-policies below approximate one with stock
# tf.image ops: each image draws one random operation, plus a random flip.
def augment(image, label):
    choice = tf.random.uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.switch_case(choice, branch_fns=[
        lambda: tf.image.random_brightness(image, max_delta=0.2),
        lambda: tf.image.random_contrast(image, lower=0.7, upper=1.3),
        lambda: tf.image.random_saturation(image, lower=0.7, upper=1.3),
        lambda: tf.image.random_hue(image, max_delta=0.1),
    ])
    image = tf.image.random_flip_left_right(image)
    image = tf.clip_by_value(image, 0.0, 1.0)
    return image, label

# Prepare training dataset with the augmentation policy
batch_size = 64
train_ds = ds_train.map(normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.shuffle(10000).batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Prepare test dataset
test_ds = ds_test.map(normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
test_ds = test_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Define the model (simple CNN)
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
history = model.fit(train_ds, epochs=50, validation_data=test_ds)

# Evaluate final metrics
train_loss, train_acc = model.evaluate(train_ds, verbose=0)
val_loss, val_acc = model.evaluate(test_ds, verbose=0)

print(f'Training accuracy: {train_acc*100:.2f}%, Validation accuracy: {val_acc*100:.2f}%')
Replaced the basic flip/crop augmentation with an AutoAugment-style policy built from tf.image ops.
Applied the policy during training-data preprocessing only; the validation data stays un-augmented.
Kept the model architecture and the 50-epoch training budget unchanged.
Results Interpretation

Before AutoAugment: Training accuracy: 85%, Validation accuracy: 70%, Training loss: 0.45, Validation loss: 1.10

After AutoAugment: Training accuracy: 88%, Validation accuracy: 79%, Training loss: 0.35, Validation loss: 0.85

Searching for augmentation policies automatically, as AutoAugment does, improves generalization: the model sees more diverse training images, which curbs overfitting and lifts validation accuracy markedly.
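One quick way to quantify the improvement is the gap between final training and validation accuracy, which you can read straight off Keras' `history.history` dict. A small sketch using the before/after numbers quoted above (the single-entry dicts stand in for full per-epoch histories):

```python
def generalization_gap(history):
    """Gap between final training and validation accuracy.
    `history` mirrors the shape of Keras' History.history dict."""
    return history['accuracy'][-1] - history['val_accuracy'][-1]

# Numbers from the experiment text, as one-entry histories
before = {'accuracy': [0.85], 'val_accuracy': [0.70]}
after = {'accuracy': [0.88], 'val_accuracy': [0.79]}

print(round(generalization_gap(before), 2))  # 0.15
print(round(generalization_gap(after), 2))   # 0.09
```

The gap shrinking from 15 to 9 percentage points is the overfitting reduction the task asked for, expressed as a single number.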
Bonus Experiment
Try using RandAugment instead of AutoAugment and compare the validation accuracy and training stability.
💡 Hint
RandAugment applies a fixed number of random augmentations with fixed magnitude, which can be simpler and faster to tune than AutoAugment.
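To see what "a fixed number of random augmentations with fixed magnitude" means in code, here is a toy RandAugment in NumPy. The five ops are simple array-level stand-ins for the paper's full op set, and the function is a sketch, not a faithful reimplementation: it picks `num_ops` transforms at random and applies each at one global `magnitude`.

```python
import numpy as np

def randaugment(image, num_ops=2, magnitude=9, rng=None):
    """Toy RandAugment: apply `num_ops` randomly chosen transforms, each
    parameterized by a single global magnitude in [0, 10]."""
    rng = rng or np.random.default_rng()
    m = magnitude / 10.0
    ops = [
        lambda img: np.fliplr(img),                     # mirror
        lambda img: np.roll(img, int(4 * m), axis=1),   # translate x
        lambda img: np.clip(img * (1 + m), 0.0, 1.0),   # brightness
        lambda img: np.clip((img - img.mean()) * (1 + m) + img.mean(),
                            0.0, 1.0),                  # contrast
        lambda img: 1.0 - img,                          # invert
    ]
    for i in rng.choice(len(ops), size=num_ops, replace=False):
        image = ops[i](image)
    return image

img = np.random.default_rng(0).random((32, 32, 3))
out = randaugment(img, num_ops=2, magnitude=9, rng=np.random.default_rng(1))
print(out.shape)  # (32, 32, 3)
```

Note the contrast with AutoAugment: there is no search loop at all, just two hyperparameters (`num_ops`, `magnitude`) to tune by grid search, which is why RandAugment is often simpler and faster to apply.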