
Learning rate scheduling in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Learning rate scheduling
Problem: You are training a neural network to classify images. The model trains well initially, but the loss plateaus and validation accuracy stops improving after some epochs.
Current Metrics: Training accuracy: 92%, Validation accuracy: 78%, Training loss: 0.25, Validation loss: 0.45
Issue: The learning rate is constant and too high, so the optimizer overshoots minima and the model stops improving.
Your Task
Use learning rate scheduling to reduce the learning rate during training and improve validation accuracy to above 85% while keeping training accuracy below 95%.
Do not change the model architecture.
Keep the batch size and number of epochs the same.
Only modify the learning rate schedule and optimizer parameters.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

# Load example dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Expand dims for CNN input
x_train = x_train[..., tf.newaxis]
x_test = x_test[..., tf.newaxis]

# Build simple CNN model
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Define initial learning rate
initial_lr = 0.01

# Scheduler: halve the learning rate every epoch from epoch 5 onward
def scheduler(epoch, lr):
    if epoch >= 5:
        return lr * 0.5
    return lr

lr_callback = callbacks.LearningRateScheduler(scheduler)

# Compile model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=initial_lr),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model with learning rate scheduler
history = model.fit(x_train, y_train, epochs=15, batch_size=64,
                    validation_split=0.2, callbacks=[lr_callback])

# Evaluate on test data
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc:.4f}")
Added a LearningRateScheduler callback that halves the learning rate every epoch from epoch 5 onward.
Started with a relatively high initial learning rate (0.01) so the model learns quickly at first, then decayed it during training.
Kept the model architecture, batch size, and number of epochs unchanged.
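Note that the scheduler does not halve the rate just once: because the callback feeds each epoch's current rate back in, the rate keeps halving every epoch from epoch 5 onward. A quick pure-Python trace (no TensorFlow needed) makes the trajectory concrete:

```python
def scheduler(epoch, lr):
    # Same logic as the callback above: halve lr every epoch from epoch 5 on
    if epoch >= 5:
        return lr * 0.5
    return lr

lr = 0.01
for epoch in range(10):
    lr = scheduler(epoch, lr)
    print(f"epoch {epoch}: lr = {lr}")
# Epochs 0-4 stay at 0.01; by epoch 9 the rate has halved five times,
# down to 0.01 * 0.5**5 = 0.0003125
```

If you want a single one-time drop instead, change the condition to `if epoch == 5:`.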
Results Interpretation

Before: Training accuracy 92%, Validation accuracy 78%, Validation loss 0.45

After: Training accuracy 93%, Validation accuracy 86%, Validation loss 0.35

Using learning rate scheduling helps the model converge better by reducing the learning rate when progress slows, improving validation accuracy and reducing overfitting.
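The same decay can also be expressed as a built-in schedule object instead of a callback, which applies the decay per optimizer step rather than per epoch. A minimal sketch using `tf.keras.optimizers.schedules.ExponentialDecay` (the `decay_steps=750` here is an illustrative value: 48,000 training samples after the 20% validation split, divided by batch size 64, gives 750 steps per epoch):

```python
import tensorflow as tf

# Halve the learning rate once per epoch-worth of steps (staircase decay)
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01,
    decay_steps=750,   # ~1 epoch at batch size 64 with a 20% validation split
    decay_rate=0.5,
    staircase=True)

optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
# Pass this optimizer to model.compile(...) in place of the callback approach
```

Unlike the callback, this schedule starts decaying immediately; keeping the flat warm-up phase of the first 5 epochs would require a custom schedule.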
Bonus Experiment
Try using the ReduceLROnPlateau callback to automatically reduce the learning rate when validation loss stops improving.
💡 Hint
Set patience parameter to control how many epochs to wait before reducing the learning rate.
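A minimal sketch of the bonus setup (the `monitor`, `factor`, `patience`, and `min_lr` values below are illustrative choices, not part of the original exercise):

```python
import tensorflow as tf
from tensorflow.keras import callbacks

reduce_lr = callbacks.ReduceLROnPlateau(
    monitor='val_loss',  # watch validation loss for a plateau
    factor=0.5,          # halve the learning rate when triggered
    patience=3,          # wait 3 epochs with no improvement first
    min_lr=1e-5)         # never go below this rate

# Then pass it to fit in place of the fixed schedule:
# history = model.fit(x_train, y_train, epochs=15, batch_size=64,
#                     validation_split=0.2, callbacks=[reduce_lr])
```

Because this callback reacts to the validation loss rather than the epoch count, it adapts the decay timing to the run instead of hard-coding epoch 5.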