Experiment - Fine-tuning approach

Problem: We want to classify images of cats and dogs using a neural network. Currently, we start from a pre-trained model but train all of its layers on a small dataset.

Current Metrics: Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 0.85

Issue: The model overfits. Training accuracy is very high while validation accuracy is low, which indicates poor generalization.
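To make the diagnosis concrete, the gap between training and validation accuracy can be computed directly from the metrics above. This is a minimal sketch; the helper name `generalization_gap` is made up for illustration and is not part of the exercise.

```python
def generalization_gap(train_acc, val_acc):
    """Return training accuracy minus validation accuracy (both as fractions)."""
    return train_acc - val_acc


# Baseline metrics reported above: 98% training, 70% validation.
gap = generalization_gap(0.98, 0.70)
print(f"accuracy gap: {gap:.2f}")  # prints "accuracy gap: 0.28"
```

A gap this large, combined with a validation loss (0.85) far above the training loss (0.05), is the usual signature of overfitting.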

Your Task

Reduce overfitting by fine-tuning only the last layers of the pre-trained model and improve validation accuracy to above 85% while keeping training accuracy below 92%.

Use the same pre-trained model architecture (MobileNetV2).

Freeze the base layers and only train the top layers initially.

Use the same dataset and batch size.
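The accuracy targets in the task can be written as a small check. The sketch below assumes accuracies are expressed as fractions; the function name `meets_targets` and its default thresholds are illustrative, taken from the task statement rather than from any library.

```python
def meets_targets(train_acc, val_acc, val_min=0.85, train_max=0.92):
    """True when validation accuracy is above val_min and training accuracy is below train_max."""
    return val_acc > val_min and train_acc < train_max


# The baseline run (98% training, 70% validation) fails both conditions.
print(meets_targets(0.98, 0.70))  # prints "False"
```

With a Keras `History` object from a model compiled with `metrics=['accuracy']`, the final values would typically come from `history.history['accuracy'][-1]` and `history.history['val_accuracy'][-1]`.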

Solution

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Load pre-trained MobileNetV2 without top layers
base_model = MobileNetV2(input_shape=(160, 160, 3), include_top=False, weights='imagenet')
base_model.trainable = False  # Freeze base

# Add new classification head
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

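# MobileNetV2's ImageNet weights expect pixels scaled to [-1, 1],
# so use the matching preprocess_input instead of rescale=1./255.
preprocess = tf.keras.applications.mobilenet_v2.preprocess_input

# Data augmentation is applied to the training subset only
train_datagen = ImageDataGenerator(preprocessing_function=preprocess, rotation_range=20, width_shift_range=0.2,
                                   height_shift_range=0.2, horizontal_flip=True, validation_split=0.2)

# Same 80/20 split, but no augmentation, so validation images are evaluated unmodified
val_datagen = ImageDataGenerator(preprocessing_function=preprocess, validation_split=0.2)

train_generator = train_datagen.flow_from_directory(
    'cats_and_dogs/train', target_size=(160, 160), batch_size=32, class_mode='binary', subset='training')

validation_generator = val_datagen.flow_from_directory(
    'cats_and_dogs/train', target_size=(160, 160), batch_size=32, class_mode='binary', subset='validation')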

# Train only new layers
history = model.fit(train_generator, epochs=10, validation_data=validation_generator)

# Unfreeze some base layers for fine-tuning
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss='binary_crossentropy', metrics=['accuracy'])

# Continue training
history_fine = model.fit(train_generator, epochs=10, validation_data=validation_generator)

Froze the base MobileNetV2 layers to preserve the pre-trained features.

Added a new classification head with dropout to reduce overfitting.

Used data augmentation to increase data variety.

Trained only the new layers first, then unfroze the last 20 base layers and fine-tuned them with a low learning rate (1e-5).
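The two training phases can be sanity-checked by inspecting what is actually trainable at each step. The sketch below is a structural check only, not the solution itself: it assumes `weights=None` to skip the ImageNet download, and it builds the head with the functional API (calling the base with `training=False`, as the Keras transfer-learning guide recommends) rather than the `Sequential` model used above.

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import MobileNetV2

# weights=None is an assumption for this sketch: only the layer structure matters here.
base_model = MobileNetV2(input_shape=(160, 160, 3), include_top=False, weights=None)
base_model.trainable = False  # phase 1: freeze the whole base

inputs = tf.keras.Input(shape=(160, 160, 3))
x = base_model(inputs, training=False)  # keep BatchNormalization in inference mode
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(1, activation="sigmoid")(x)
model = tf.keras.Model(inputs, outputs)

# With the base frozen, only the Dense head (kernel and bias) is trainable.
frozen_trainable = len(model.trainable_weights)

# Phase 2: unfreeze the base, then re-freeze everything except its last 20 layers.
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False
unfrozen_layers = sum(layer.trainable for layer in base_model.layers)

print(frozen_trainable, unfrozen_layers)
```

Changes to `trainable` only take effect after the model is compiled again, which is why the solution calls `model.compile` a second time before fine-tuning.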

Results Interpretation

Before fine-tuning: Training accuracy was 98% but validation accuracy was only 70%, showing overfitting.

After fine-tuning: Training accuracy dropped to 90%, validation accuracy improved to 87%, and validation loss fell significantly from its earlier value of 0.85.

Freezing the pre-trained layers and training only the new layers helps prevent overfitting on small datasets, because far fewer parameters are fit to the limited data. Fine-tuning some base layers with a low learning rate then improves validation performance further by adapting the pre-trained features to the new task without overwriting them.

Bonus Experiment

Try using a different pre-trained model like EfficientNetB0 and compare validation accuracy and training time.

💡 Hint

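EfficientNet models are designed for a strong accuracy-to-size trade-off and may give better validation accuracy, although EfficientNetB0 has more parameters than MobileNetV2. Use the same fine-tuning approach, but note that the Keras EfficientNet models include their own input rescaling and expect raw pixel values in the [0, 255] range, so do not rescale the images beforehand.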
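A possible starting point for the bonus experiment is sketched below. It swaps in `EfficientNetB0` with the same frozen-base head; the function name `build_efficientnet_classifier` is hypothetical, and `weights=None` is used in the example call only so the sketch runs without downloading weights (the real experiment would use `weights='imagenet'`).

```python
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB0


def build_efficientnet_classifier(weights="imagenet"):
    """Frozen EfficientNetB0 base plus the same dropout + sigmoid head as the solution."""
    base = EfficientNetB0(input_shape=(160, 160, 3), include_top=False, weights=weights)
    base.trainable = False  # freeze the base for the first training phase

    # Keras EfficientNet rescales inputs internally, so feed raw [0, 255] pixels.
    inputs = tf.keras.Input(shape=(160, 160, 3))
    x = base(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)

    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# weights=None here is an assumption to keep the sketch offline-friendly.
model = build_efficientnet_classifier(weights=None)
print(model.output_shape)
```

To compare training time, one option is to wrap each `model.fit` call with `time.perf_counter()` and record the difference for both architectures.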