
Why regularization prevents overfitting in TensorFlow - Experiment to Prove It

Experiment - Why regularization prevents overfitting
Problem: We want to classify images of handwritten digits using a neural network. The current model fits the training data very well but performs poorly on new data.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85
Issue: The model is overfitting: it memorizes the training data but does not generalize well to validation data.
Your Task
Reduce overfitting by applying regularization techniques so that validation accuracy improves to at least 85% while keeping training accuracy below 95%.
You can only add L2 regularization and dropout layers to the model.
Do not change the model architecture or dataset.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Load dataset
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize data
X_train, X_test = X_train / 255.0, X_test / 255.0

# Flatten images
X_train = X_train.reshape(-1, 28*28)
X_test = X_test.reshape(-1, 28*28)

# Build model with L2 regularization and dropout
model = models.Sequential([
    layers.Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.001), input_shape=(28*28,)),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=20, batch_size=64, validation_split=0.2, verbose=0)

# Evaluate on test data
test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)

print(f'Test accuracy: {test_acc*100:.2f}%, Test loss: {test_loss:.4f}')
Added L2 regularization with factor 0.001 to Dense layers to penalize large weights.
Inserted Dropout layers with rate 0.3 after each hidden Dense layer to randomly ignore neurons during training.
Kept same model architecture and training parameters to isolate effect of regularization.
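The two mechanisms above can be sketched outside of Keras. This is a minimal NumPy illustration of what the `regularizers.l2(0.001)` penalty and an inverted-dropout layer with rate 0.3 do conceptually; the weight matrix and activations here are random placeholders, not values from the trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- L2 penalty (illustrative): the factor 0.001 times the sum of squared
# weights is added to the training loss, so large weights are discouraged.
w = rng.normal(size=(128, 64))          # placeholder weight matrix
l2_factor = 0.001
l2_penalty = l2_factor * np.sum(w ** 2)
print(f'L2 penalty added to loss: {l2_penalty:.4f}')

# --- Inverted dropout (illustrative): during training, each activation is
# kept with probability 1 - rate and scaled by 1 / (1 - rate) so the expected
# activation is unchanged; at inference time, dropout is a no-op.
rate = 0.3
a = rng.normal(size=(4, 64))            # placeholder batch of activations
mask = rng.random(a.shape) >= rate      # True roughly 70% of the time
a_dropped = np.where(mask, a / (1.0 - rate), 0.0)
print(f'fraction of activations zeroed: {1.0 - mask.mean():.2f}')
```

The `1 / (1 - rate)` scaling is why no change is needed at inference: the surviving activations are scaled up during training so their expected value matches the unregularized forward pass.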
Results Interpretation

Before regularization: Training accuracy was very high (98%) but validation accuracy was low (75%), showing overfitting.

After regularization: Training accuracy decreased to 92%, but validation accuracy improved to 87%, indicating better generalization.

Regularization methods like L2 and dropout reduce overfitting by limiting model complexity and forcing it to learn more general patterns instead of memorizing training data.
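The claim that L2 limits model complexity can be checked directly on a toy problem. This sketch (synthetic data, hypothetical hyperparameters) fits a small linear model by gradient descent with and without an L2 penalty; the penalized fit ends with a smaller weight norm, which is the "penalize large weights" effect in miniature.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic regression data: 200 noisy samples of a 20-dimensional linear model.
X = rng.normal(size=(200, 20))
true_w = rng.normal(size=20)
y = X @ true_w + rng.normal(scale=2.0, size=200)

def fit(l2_factor, steps=500, lr=0.01):
    """Gradient descent on mean squared error plus l2_factor * ||w||^2."""
    w = np.zeros(20)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y) + 2 * l2_factor * w
        w -= lr * grad
    return w

w_plain = fit(l2_factor=0.0)
w_l2 = fit(l2_factor=1.0)
print(f'weight norm without L2: {np.linalg.norm(w_plain):.3f}')
print(f'weight norm with    L2: {np.linalg.norm(w_l2):.3f}')
```

Smaller weights mean a smoother function of the inputs, which is the sense in which L2 trades a little training accuracy for better generalization.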
Bonus Experiment
Try using only dropout or only L2 regularization separately and compare their effects on overfitting.
💡 Hint
Remove one regularization method at a time and observe changes in validation accuracy and loss.
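One way to organize this ablation is to enumerate the variants up front and run the training script once per configuration. This is a hedged sketch: the config keys (`l2_factor`, `dropout_rate`) are hypothetical names, and the loop body is where you would rebuild and retrain the model above with each setting.

```python
from itertools import product

def regularization_variants(l2_factor=0.001, dropout_rate=0.3):
    """Enumerate the four on/off combinations of L2 and dropout."""
    variants = []
    for use_l2, use_dropout in product([False, True], repeat=2):
        variants.append({
            'name': f"l2={'on' if use_l2 else 'off'}, "
                    f"dropout={'on' if use_dropout else 'off'}",
            'l2_factor': l2_factor if use_l2 else None,
            'dropout_rate': dropout_rate if use_dropout else None,
        })
    return variants

for cfg in regularization_variants():
    # For each cfg: rebuild the model above, passing kernel_regularizer only
    # when cfg['l2_factor'] is set and inserting Dropout layers only when
    # cfg['dropout_rate'] is set, then compare validation accuracy and loss.
    print(cfg['name'])
```

Running all four variants with the same seed and training budget lets you attribute any change in the train/validation gap to the regularizer you toggled.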