Data augmentation importance in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Data augmentation importance

Problem: We want to train a model to recognize handwritten digits using the MNIST dataset. The current model trains well on the training data but performs poorly on new images.

Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Validation loss: 0.45

Issue: The model is overfitting. It learns the training data too well but does not generalize to new data.
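
The gap between training and validation accuracy is a quick way to put a number on this diagnosis. A minimal sketch, assuming accuracies are given as fractions; the function names and the 0.05 threshold are illustrative choices, not part of the exercise:

```python
def generalization_gap(train_acc, val_acc):
    """Difference between training and validation accuracy (as fractions)."""
    return train_acc - val_acc


def looks_overfit(train_acc, val_acc, max_gap=0.05):
    """Flag a run whose gap exceeds max_gap (0.05 is an assumed rule of thumb)."""
    return generalization_gap(train_acc, val_acc) > max_gap


# Metrics reported above: 98% training accuracy, 85% validation accuracy.
gap = generalization_gap(0.98, 0.85)
print(f"gap = {gap:.2f}, overfit = {looks_overfit(0.98, 0.85)}")
```

A large positive gap suggests memorization of the training set; the right threshold depends on the task, so treat the default here as a starting point only.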

Your Task

Use data augmentation to reduce overfitting and improve validation accuracy to above 90% while keeping training accuracy below 95%.

You can only add data augmentation techniques during training.

Do not change the model architecture or optimizer.

Keep training epochs and batch size the same.

Solution

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize data
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Reshape for model input
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# Define simple model
model = Sequential([
    Flatten(input_shape=(28,28,1)),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Data augmentation setup
datagen = ImageDataGenerator(
    rotation_range=10,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1
)

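# datagen.fit(X_train) is omitted: it is only needed when featurewise_center,
# featurewise_std_normalization, or zca_whitening is enabled, and none are used here.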

# Train model with augmentation
batch_size = 64
epochs = 10

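history = model.fit(
    # flow() yields randomly transformed batches, so each epoch sees new variants
    datagen.flow(X_train, y_train, batch_size=batch_size),
    epochs=epochs,
    # The MNIST test split serves as the validation set here and is never augmented
    validation_data=(X_test, y_test),
    steps_per_epoch=len(X_train) // batch_size
)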
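
The steps_per_epoch expression in the call above uses floor division, so a trailing partial batch is not counted. A quick check of the arithmetic, assuming the standard 60,000-image MNIST training split (the variable names are illustrative):

```python
import math

num_train_images = 60_000  # size of the standard MNIST training split
batch_size = 64            # same value as in the solution code

# Floor division: the value passed as steps_per_epoch.
full_batches = num_train_images // batch_size
images_seen = full_batches * batch_size
images_skipped = num_train_images - images_seen

# Number of steps that would be needed to include the final partial batch.
steps_with_partial = math.ceil(num_train_images / batch_size)

print(full_batches, images_seen, images_skipped, steps_with_partial)
```

Each epoch therefore runs 937 steps, or 59,968 images, which is slightly less than one full pass over the training data.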

Added ImageDataGenerator with rotation, width/height shifts, and zoom augmentation.

Used datagen.flow to feed augmented images during training.

Kept model architecture and training parameters unchanged.
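
To make the shift augmentation concrete, here is a simplified pure-Python sketch of what a random shift does to one image. It is not how Keras implements it: this version uses whole-pixel shifts and fills uncovered pixels with zeros, and the helper names shift_image and random_shift are hypothetical.

```python
import random


def shift_image(image, dx, dy, fill=0.0):
    """Return a copy of image (a list of rows) moved dx pixels right and dy pixels down."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            new_y, new_x = y + dy, x + dx
            # Pixels pushed past the border are dropped.
            if 0 <= new_y < height and 0 <= new_x < width:
                shifted[new_y][new_x] = image[y][x]
    return shifted


def random_shift(image, max_fraction=0.1, rng=random):
    """Shift by a random whole-pixel offset of at most max_fraction of each dimension."""
    height, width = len(image), len(image[0])
    max_dx = int(max_fraction * width)
    max_dy = int(max_fraction * height)
    dx = rng.randint(-max_dx, max_dx)
    dy = rng.randint(-max_dy, max_dy)
    return shift_image(image, dx, dy)


# A blank 28x28 image with one bright pixel standing in for a digit stroke.
image = [[0.0] * 28 for _ in range(28)]
image[14][14] = 1.0

moved = shift_image(image, dx=2, dy=-1)      # two pixels right, one pixel up
augmented = random_shift(image, rng=random.Random(0))
```

The label stays the same while the pixel positions change, which is what forces the model to rely on the shape of the digit instead of its exact location.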

Results Interpretation

Before augmentation: Training accuracy was 98%, validation accuracy was 85%, showing overfitting.

After augmentation: Training accuracy dropped to 93%, validation accuracy improved to 91%, and validation loss decreased, indicating better generalization.

Data augmentation helps the model see more varied examples, reducing overfitting and improving performance on new data.
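
One way to read such numbers off a training run is from the history.history dictionary returned by model.fit. A minimal sketch, assuming the accuracy and val_accuracy keys that Keras records when the model is compiled with metrics=['accuracy']; the helper name is hypothetical and the per-epoch values below are placeholders, not measured results:

```python
def summarize_history(history_dict):
    """Report final-epoch accuracies and whether the task targets are met."""
    train_acc = history_dict["accuracy"][-1]
    val_acc = history_dict["val_accuracy"][-1]
    return {
        "train_acc": train_acc,
        "val_acc": val_acc,
        # Task targets: validation accuracy above 90%, training accuracy below 95%.
        "meets_targets": val_acc > 0.90 and train_acc < 0.95,
    }


# Placeholder values for illustration; a real run would pass history.history.
example_history = {
    "accuracy": [0.80, 0.90, 0.93],
    "val_accuracy": [0.86, 0.89, 0.91],
}
summary = summarize_history(example_history)
print(summary)
```

In the solution code, the equivalent call would be summarize_history(history.history) after training finishes.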

Bonus Experiment

Try adding more augmentation types like horizontal flips or brightness changes and observe the effect on validation accuracy.

💡 Hint

Be careful with flips on digits: a mirrored digit may no longer match its label, or may resemble a different digit, so check that validation accuracy does not drop when flips are enabled.
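
A toy illustration of why flips are risky for digits. The 3x5 block-style glyphs below are made up for this sketch (they are not MNIST images), and flip_horizontal is a hypothetical helper:

```python
def flip_horizontal(glyph):
    """Mirror a glyph (a list of equal-length strings) left to right."""
    return [row[::-1] for row in glyph]


# Made-up block glyphs in a seven-segment style.
ZERO = ["###",
        "# #",
        "# #",
        "# #",
        "###"]

TWO = ["###",
       "  #",
       "###",
       "#  ",
       "###"]

FIVE = ["###",
        "#  ",
        "###",
        "  #",
        "###"]

# A symmetric glyph survives the flip; an asymmetric one does not.
zero_unchanged = flip_horizontal(ZERO) == ZERO
two_becomes_five = flip_horizontal(TWO) == FIVE

print("\n".join(flip_horizontal(TWO)))
```

In this toy font a mirrored 2 is indistinguishable from a 5, so a flipped training image would carry the wrong label. Real handwritten digits are less regular, but the same concern applies.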

Practice

1. Why is data augmentation important in training computer vision models?

easy

A. It increases the variety of training images to help the model generalize better.

B. It reduces the size of the training dataset to speed up training.

C. It removes noisy images from the dataset automatically.

D. It guarantees 100% accuracy on the training data.

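Solution

Step 1: Understand data augmentation purpose. Augmentation creates modified versions of the training images (for example rotated, shifted, or zoomed) without collecting new data.

Step 2: Connect augmentation to model learning. Seeing more varied examples makes it harder for the model to memorize the training set, so it generalizes better to new images.

Final Answer: A. It increases the variety of training images to help the model generalize better.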
