Optimizers (SGD, Adam, RMSprop) in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Optimizers (SGD, Adam, RMSprop)

Problem: Train a simple neural network on the MNIST dataset to classify handwritten digits.

Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.

Your Task

Reduce overfitting by changing the optimizer to improve validation accuracy to above 90% while keeping training accuracy below 95%.

Keep the model architecture the same.

Only change the optimizer and its hyperparameters.

Use TensorFlow 2.x API.

Solution

TensorFlow

import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize data
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile model with the Adam optimizer (learning rate set explicitly to 0.001, the Keras default)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.2, verbose=0)

# Evaluate on test data
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)

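print(f"Test accuracy: {test_accuracy*100:.2f}%, Test loss: {test_loss:.4f}")

# Report final-epoch training and validation accuracy, the metrics the task targets
train_acc = history.history['accuracy'][-1]
val_acc = history.history['val_accuracy'][-1]
print(f"Training accuracy: {train_acc*100:.2f}%, Validation accuracy: {val_acc*100:.2f}%")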

Replaced the SGD optimizer with Adam.

Set Adam's learning rate to 0.001, which is also its default in Keras.

Kept the model architecture unchanged.

Used validation_split=0.2 to hold out 20% of the training data for monitoring validation accuracy.
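To make the optimizer swap less of a black box, here is a minimal sketch of a single Adam update for one scalar weight. It follows the formulation in the Adam paper rather than Keras's exact implementation. The function name `adam_step` and the toy objective f(w) = w**2 are assumptions for illustration; the default values mirror the documented Keras defaults for `learning_rate`, `beta_1`, `beta_2`, and `epsilon`.

```python
import math


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """One Adam update for a scalar weight w at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


# Hypothetical toy objective f(w) = w**2, whose gradient is 2*w.
w, m, v = 1.0, 0.0, 0.0
w, m, v = adam_step(w, grad=2 * w, m=m, v=v, t=1)
print(w)  # close to 0.999: the first step has a size of about lr
```

Because the gradient is divided by the square root of its own running squared average, the step size depends mainly on the learning rate rather than on the raw gradient magnitude, which is one reason the learning rate is the main hyperparameter to tune.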

Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

After: Training accuracy: 93%, Validation accuracy: 91%, Training loss: 0.18, Validation loss: 0.25

In this experiment, switching from SGD to Adam with a learning rate of 0.001 narrowed the gap between training and validation accuracy from 13 percentage points to 2 and raised validation accuracy above the 90% target, while training accuracy stayed below 95%. This shows that optimizer choice and tuning can affect how well a model generalizes.
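The before/after comparison comes down to simple arithmetic on the reported metrics. The sketch below computes the train/validation gap and checks it against the task targets (validation above 90%, training below 95%). The helper name `generalization_report` is an assumption for illustration; the numbers are the ones reported above.

```python
def generalization_report(train_acc, val_acc, val_target=90, train_cap=95):
    """Return the train/validation gap in percentage points and whether the task targets are met."""
    return {
        'gap': train_acc - val_acc,
        'meets_task': val_acc > val_target and train_acc < train_cap,
    }


before = generalization_report(train_acc=98, val_acc=85)
after = generalization_report(train_acc=93, val_acc=91)
print(before)  # {'gap': 13, 'meets_task': False}
print(after)   # {'gap': 2, 'meets_task': True}
```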

Bonus Experiment

Try the RMSprop optimizer with different learning rates and compare its validation accuracy and loss against Adam and SGD.

💡 Hint

Use learning rates like 0.001 and 0.0005 with RMSprop and observe how the model's validation accuracy changes.
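One possible way to set up this comparison is sketched below. It trains a fresh copy of the same architecture for each learning rate and records the final validation metrics. The helper names `build_model` and `sweep_rmsprop` are assumptions for illustration, not part of TensorFlow.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_model():
    """Same architecture as the solution: Flatten -> Dense(128, relu) -> Dense(10, softmax)."""
    return models.Sequential([
        layers.Flatten(input_shape=(28, 28)),
        layers.Dense(128, activation='relu'),
        layers.Dense(10, activation='softmax'),
    ])


def sweep_rmsprop(x, y, learning_rates=(0.001, 0.0005), epochs=10, batch_size=64):
    """Train one fresh model per learning rate and return final validation metrics."""
    results = {}
    for lr in learning_rates:
        model = build_model()  # fresh weights so runs do not influence each other
        model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=lr),
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
        history = model.fit(x, y, epochs=epochs, batch_size=batch_size,
                            validation_split=0.2, verbose=0)
        results[lr] = {
            'val_accuracy': history.history['val_accuracy'][-1],
            'val_loss': history.history['val_loss'][-1],
        }
    return results
```

To run it on MNIST, pass the normalized training arrays from the solution, for example `sweep_rmsprop(x_train, y_train)`, and compare the returned values with the Adam and SGD runs. Results vary between runs because the weights are initialized randomly.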

Practice

1. Which optimizer in TensorFlow uses momentum to accelerate gradient descent and reduce oscillations? (Difficulty: easy)

A. SGD with momentum

B. Adam

C. RMSprop

D. Adagrad

Solution

Step 1: Understand momentum in optimizers
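As a rough illustration of what momentum does, the sketch below applies the classic momentum update (velocity = momentum * velocity - lr * grad, then w = w + velocity) to one scalar weight. The function name `sgd_momentum_step`, the toy objective f(w) = w**2, and the hyperparameter values are assumptions chosen for illustration.

```python
def sgd_momentum_step(w, grad, velocity, lr=0.1, momentum=0.9):
    """One gradient-descent step with momentum for a scalar weight w."""
    velocity = momentum * velocity - lr * grad  # accumulate past gradients
    return w + velocity, velocity


# Hypothetical toy objective f(w) = w**2, whose gradient is 2*w.
w, velocity = 1.0, 0.0
for _ in range(2):
    w, velocity = sgd_momentum_step(w, grad=2 * w, velocity=velocity)
print(w)  # about 0.46; plain gradient descent with the same lr reaches 0.64
```

The velocity term carries over part of the previous step, so consecutive gradients that point the same way build up speed, while gradients that flip sign partly cancel.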

Step 2: Identify optimizer using momentum
