TensorFlow · ~20 mins

Batch normalization in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Batch normalization
Problem: Train a neural network to classify handwritten digits from the MNIST dataset.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model shows overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting by improving validation accuracy to at least 90% while keeping training accuracy below 95%.
You can only add batch normalization layers after each dense layer.
Do not change the model architecture or dataset.
Keep the number of epochs and batch size the same.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST data
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Flatten images
X_train = X_train.reshape(-1, 28*28)
X_test = X_test.reshape(-1, 28*28)

# Build model with batch normalization
model = models.Sequential([
    layers.Input(shape=(28*28,)),
    layers.Dense(128),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dense(64),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.2, verbose=0)

# Evaluate on test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)

print(f'Test accuracy: {accuracy*100:.2f}%')
print(f'Test loss: {loss:.4f}')
Added BatchNormalization layers after each Dense layer and before activation.
Kept the same model architecture and training parameters.
Normalized input data and flattened images as before.
Results Interpretation

Before batch normalization: Training accuracy was 98%, validation accuracy 85%, showing overfitting.

After batch normalization: Training accuracy reduced to 93%, validation accuracy improved to 91%, with lower validation loss.

Batch normalization normalizes each layer's inputs using per-batch statistics; the noise those statistics introduce acts as a mild regularizer, which stabilizes training and improves generalization.
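To make the normalization step concrete, here is a minimal NumPy sketch of what a BatchNormalization layer computes in training mode: subtract the per-feature batch mean, divide by the batch standard deviation, then apply a learnable scale (gamma) and shift (beta). The function name and the toy data are illustrative, not part of the exercise code.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Training-mode batch normalization over the batch axis."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # ~zero mean, unit variance
    return gamma * x_hat + beta               # learnable scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))  # batch of 32 samples, 4 features
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # close to 0 for every feature
print(out.std(axis=0))   # close to 1 for every feature
```

At inference time Keras instead uses running averages of the mean and variance accumulated during training, so predictions do not depend on batch composition.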
Bonus Experiment
Try adding dropout layers after batch normalization to see if validation accuracy improves further.
💡 Hint
Dropout randomly disables neurons during training, which can further reduce overfitting.
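As a sketch of that mechanism, the NumPy snippet below implements "inverted" dropout, the variant Keras's Dropout layer uses during training: a random fraction of activations is zeroed, and the survivors are rescaled by 1/(1-rate) so the expected activation is unchanged. The function name and toy activations are illustrative assumptions.

```python
import numpy as np

def inverted_dropout(x, rate, rng):
    """Zero a fraction `rate` of activations and rescale the rest
    by 1/(1-rate) so the expected value stays the same."""
    mask = rng.random(x.shape) >= rate   # True = keep this activation
    return x * mask / (1.0 - rate)

rng = np.random.default_rng(42)
acts = np.ones((1000, 64))               # toy activations, all equal to 1.0
dropped = inverted_dropout(acts, rate=0.5, rng=rng)
# About half the entries are zero; the kept ones are scaled to 2.0,
# so the overall mean stays close to the original 1.0.
print(dropped.mean())
```

In the Keras model this would correspond to inserting a `layers.Dropout(rate)` after each activation; at inference time dropout is disabled automatically, so no rescaling is needed there.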