TensorFlow · ~20 mins

Pre-trained models (VGG, ResNet, MobileNet) in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Pre-trained models (VGG, ResNet, MobileNet)
Problem: You want to classify images into 10 categories using a pre-trained model. Currently, you use VGG16 without fine-tuning. The training accuracy is 95%, but validation accuracy is only 70%.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%, Training loss: 0.15, Validation loss: 0.85
Issue: The model overfits: training accuracy is high, but validation accuracy is much lower.
Your Task
Reduce overfitting and improve validation accuracy to at least 85% while keeping training accuracy below 92%.
Use TensorFlow and Keras only.
Use one of the pre-trained models: VGG16, ResNet50, or MobileNet.
You can add dropout or data augmentation.
Do not change the dataset or number of classes.
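Dropout, one of the allowed remedies, randomly zeroes activations during training so the network cannot rely on any single feature. A minimal NumPy sketch of inverted dropout (illustrative only, not the Keras implementation; the `dropout` helper is a made-up name):

```python
import numpy as np

# Inverted dropout: zero a random fraction `rate` of activations and
# rescale the survivors by 1/(1-rate) so the expected value is unchanged.
def dropout(x, rate, rng):
    mask = rng.random(x.shape) >= rate   # keep each unit with prob. 1 - rate
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(0)
x = np.ones((1000,))
y = dropout(x, rate=0.4, rng=rng)
print(y.mean())  # close to 1.0: the expected activation is preserved
```

Because the rescaling happens at training time, nothing needs to change at inference, which is exactly how Keras's `Dropout` layer behaves.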
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load MobileNet base model with pretrained weights, exclude top layers
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
base_model.trainable = False

# Add new classification head
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.4),
    layers.Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Data augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)

val_datagen = ImageDataGenerator(rescale=1./255)

# Assume train_images, train_labels, val_images, val_labels are NumPy arrays
# of shape (N, H, W, C); ImageDataGenerator.flow expects NumPy arrays.
# Random placeholders are used here for example purposes.
import numpy as np

train_images = np.random.randint(0, 256, size=(1000, 128, 128, 3), dtype=np.uint8)
train_labels = np.random.randint(0, 10, size=(1000,), dtype=np.int32)
val_images = np.random.randint(0, 256, size=(200, 128, 128, 3), dtype=np.uint8)
val_labels = np.random.randint(0, 10, size=(200,), dtype=np.int32)

train_generator = train_datagen.flow(train_images, train_labels, batch_size=32)
val_generator = val_datagen.flow(val_images, val_labels, batch_size=32)

# Train model
history = model.fit(train_generator, epochs=15, validation_data=val_generator)

# Unfreeze some base model layers for fine-tuning
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history_fine = model.fit(train_generator, epochs=10, validation_data=val_generator)
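As a side note on what the augmentation pipeline does under the hood, here is a minimal NumPy sketch of one transform it applies, the horizontal flip (the `horizontal_flip` helper is a hypothetical stand-in, not a Keras API):

```python
import numpy as np

# A horizontal flip reverses the width axis of an (N, H, W, C) image batch,
# giving the model a mirrored variant of every training image.
def horizontal_flip(batch):
    return batch[:, :, ::-1, :]   # reverse the W axis

batch = np.arange(2 * 4 * 4 * 3).reshape(2, 4, 4, 3).astype(np.float32)
flipped = horizontal_flip(batch)
assert np.array_equal(horizontal_flip(flipped), batch)  # flip is its own inverse
```

Each random transform (rotation, shift, flip) produces a plausible new training sample, which is why augmentation acts as a regularizer.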
Key Changes
Switched from VGG16 to MobileNet for a lighter, more generalizable model.
Added dropout layer with rate 0.4 to reduce overfitting.
Applied data augmentation to increase training data variety.
Initially froze base model layers and trained only the new head.
Later unfroze last 20 layers of base model for fine-tuning with low learning rate.
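The freeze pattern from the last point can be illustrated on a stand-in layer list (the `Layer` class below is a hypothetical mock, not Keras code; it only mimics the `trainable` flag):

```python
# Stand-in for a Keras layer: just a name and a trainable flag.
class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers_ = [Layer(f"conv_{i}") for i in range(30)]  # mock base_model.layers

# Same slicing as the solution: freeze everything except the last 20 layers.
for layer in layers_[:-20]:
    layer.trainable = False

trainable = [l for l in layers_ if l.trainable]
print(len(trainable))  # 20
```

Keeping the early layers frozen preserves generic low-level features, while the last layers adapt to the new 10-class task.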
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70%, high overfitting.

After: Training accuracy 90%, Validation accuracy 87%, much better generalization.

Using a lighter pre-trained model with dropout and data augmentation reduces overfitting and improves validation accuracy. Fine-tuning some layers helps the model adapt better to new data.
Bonus Experiment
Try using ResNet50 with similar dropout and data augmentation to see if validation accuracy improves further.
💡 Hint
ResNet50 has skip connections that help training deeper networks. Freeze base layers first, then fine-tune with a low learning rate.
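The skip connection the hint mentions can be sketched in NumPy (a toy one-layer residual block, not the actual ResNet50 architecture; `residual_block` is an illustrative name):

```python
import numpy as np

# A residual block computes x + F(x): the input always has a direct path
# through, which keeps gradients flowing in very deep networks.
def residual_block(x, w):
    fx = np.maximum(x @ w, 0.0)   # F(x): a single ReLU layer (illustrative)
    return x + fx                  # skip connection

x = np.ones((1, 4))
w_zero = np.zeros((4, 4))
y = residual_block(x, w_zero)
assert np.array_equal(y, x)  # with F(x) = 0 the block is an identity map
```

This identity-by-default behavior is why stacking many residual blocks stays trainable where plain deep stacks would degrade.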