Data augmentation as regularization in TensorFlow

Introduction

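Data augmentation shows the model many randomly altered versions of the same training examples, such as flipped, rotated, or zoomed images. Because the model rarely sees exactly the same input twice, it is harder for it to memorize the training set, so it generalizes better to new data. This is why augmentation acts as a form of regularization. It is most useful in these situations: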

When you have a small set of images and want the model to learn more from them.

When your model starts to memorize training data and performs poorly on new data.

When you want to improve the model's ability to handle real-world variations like rotations or brightness changes.

When training a model for tasks like image classification or object detection.

When you want to add simple tricks to improve model accuracy without changing the model itself.

Syntax

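import tensorflow as tf

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# original_images: a batch of images shaped (batch, height, width, channels)
augmented_images = data_augmentation(original_images, training=True)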

Use tf.keras.Sequential to stack augmentation layers.

Apply augmentation only during training, not during testing or validation.
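As a rough illustration of the point above, the sketch below calls the same pipeline once with training=True and once with training=False. The image batch is random dummy data, and the variable names are placeholders chosen for this example.

```python
import numpy as np
import tensorflow as tf

# Dummy batch: 4 images, 32x32 pixels, 3 channels (placeholder data)
images = tf.random.uniform(shape=(4, 32, 32, 3))

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# training=True: the random transformations are applied
train_batch = data_augmentation(images, training=True)

# training=False: the layers pass the images through unchanged
eval_batch = data_augmentation(images, training=False)

unchanged = np.allclose(eval_batch.numpy(), images.numpy())
print("Unchanged with training=False:", unchanged)
```

Passing the flag explicitly makes it clear which mode the layers run in when the pipeline is called outside of model training.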

Examples

This example randomly flips images horizontally, vertically, or both, and rotates them by a random angle of up to 20% of a full circle (72 degrees) in either direction.

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal_and_vertical'),
    tf.keras.layers.RandomRotation(0.2),
])
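The rotation factor is a fraction of a full turn, so converting it to degrees is simple arithmetic. The helper name below is made up for this illustration and is not part of TensorFlow.

```python
def max_rotation_degrees(factor):
    """Convert a RandomRotation factor (fraction of a full circle) to degrees."""
    return factor * 360.0

# A factor of 0.2 allows rotations of up to 72 degrees in either direction
degrees = round(max_rotation_degrees(0.2), 6)
print(degrees)
```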

This example randomly adjusts image contrast by up to 20% and zooms in or out by up to 15%.

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomContrast(0.2),
    tf.keras.layers.RandomZoom(0.15),
])

Sample Model

This code creates two random images and applies the augmentation layers to them. It prints the mean pixel value before and after augmentation. The two means are usually close but not identical, and they vary between runs because the transformations are random.

import tensorflow as tf

# Create dummy image data: 2 images, 64x64 pixels, 3 color channels
original_images = tf.random.uniform(shape=(2, 64, 64, 3))

# Define data augmentation layers
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

# Apply augmentation (training=True keeps the random layers active)
augmented_images = data_augmentation(original_images, training=True)

# Show difference between original and augmented images (mean pixel values)
print(f"Original images mean pixel value: {tf.reduce_mean(original_images).numpy():.4f}")
print(f"Augmented images mean pixel value: {tf.reduce_mean(augmented_images).numpy():.4f}")
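Comparing means is a coarse check, since a flip alone leaves the mean untouched. One possible alternative, sketched here with dummy data and a fixed seed chosen only for repeatability, is to measure the per-pixel difference directly.

```python
import tensorflow as tf

tf.keras.utils.set_random_seed(0)  # arbitrary seed, for repeatable runs

original_images = tf.random.uniform(shape=(2, 64, 64, 3))

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

augmented_images = data_augmentation(original_images, training=True)

# Augmentation keeps the batch shape but moves pixels around
same_shape = bool(augmented_images.shape == original_images.shape)
mean_abs_diff = float(tf.reduce_mean(tf.abs(augmented_images - original_images)))

print("Same shape:", same_shape)
print(f"Mean absolute pixel difference: {mean_abs_diff:.4f}")
```

A difference well above zero indicates that pixels actually moved, even when the overall mean barely changes.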

Important Notes

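Apply data augmentation only during training, not during validation or testing. Keras random augmentation layers return their input unchanged when called with training=False, so when they are part of a model they are skipped automatically by evaluate() and predict().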
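One way to get this behavior without extra bookkeeping is to put the augmentation layers inside the model itself. The tiny classifier below is a hypothetical sketch for illustration only; its layer sizes and the dummy input are assumptions, not a recommended architecture.

```python
import numpy as np
import tensorflow as tf

# Hypothetical tiny classifier with augmentation as its first layers
model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 3)),
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2),
])

images = tf.random.uniform(shape=(4, 32, 32, 3))  # placeholder data

# With training=False the augmentation layers are skipped,
# so two forward passes over the same batch give the same result
first = model(images, training=False).numpy()
second = model(images, training=False).numpy()
deterministic = np.allclose(first, second)
print("Identical inference outputs:", deterministic)
```

With this layout, fit() runs the augmentation and inference calls bypass it, so the same model object can be used for both.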

Augmentation can slow down training because the transformations are computed on the fly for every batch.
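A common way to reduce this cost is to run augmentation inside a tf.data input pipeline, so batches are prepared in parallel while the model trains. The sketch below assumes a small in-memory dataset of random images with placeholder labels; the batch size is arbitrary.

```python
import tensorflow as tf

# Placeholder dataset: 8 random images with integer labels
images = tf.random.uniform(shape=(8, 32, 32, 3))
labels = tf.zeros(shape=(8,), dtype=tf.int32)

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip('horizontal'),
    tf.keras.layers.RandomRotation(0.1),
])

train_ds = (
    tf.data.Dataset.from_tensor_slices((images, labels))
    .batch(4)
    # Augment batches in parallel as part of the input pipeline
    .map(lambda x, y: (data_augmentation(x, training=True), y),
         num_parallel_calls=tf.data.AUTOTUNE)
    # Prepare the next batch while the model trains on the current one
    .prefetch(tf.data.AUTOTUNE)
)

for batch_images, batch_labels in train_ds.take(1):
    print(batch_images.shape, batch_labels.shape)
```

Only the training dataset is mapped through the augmentation layers here; validation and test datasets would be built without that map step.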

Try different augmentation types to see what helps your model most.

Summary

Data augmentation creates new training data by changing existing data slightly.

This helps the model learn better and avoid memorizing the training data.

Use TensorFlow's built-in layers like RandomFlip and RandomRotation to add augmentation easily.