0
0
TensorFlowml~20 mins

Compiling models (optimizer, loss, metrics) in TensorFlow - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Compiling models (optimizer, loss, metrics)
Problem:You have built a simple neural network model to classify handwritten digits using the MNIST dataset. The model is compiled with the Adam optimizer, sparse categorical crossentropy loss, and accuracy metric.
Current Metrics:Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue:The model shows signs of overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Adjust the model compilation parameters to reduce overfitting and improve validation accuracy to at least 90%, while keeping training accuracy below 95%.
You can only change the optimizer, loss function, and metrics in the model.compile() step.
Do not change the model architecture or dataset preprocessing.
Hint 1
Hint 2
Hint 3
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize data
X_train, X_test = X_train / 255.0, X_test / 255.0

# Build model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

# Compile model with adjusted parameters
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(label_smoothing=0.1),
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy(), tf.keras.metrics.SparseTopKCategoricalAccuracy(k=3)]
)

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.2, verbose=0)

# Evaluate model
train_acc = history.history['sparse_categorical_accuracy'][-1] * 100
val_acc = history.history['val_sparse_categorical_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%')
print(f'Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.4f}')
print(f'Validation loss: {val_loss:.4f}')
Set Adam optimizer learning rate explicitly to 0.001 for stable training.
Added label_smoothing=0.1 to SparseCategoricalCrossentropy to introduce regularization and reduce overfitting.
Added SparseTopKCategoricalAccuracy(k=3) metric to monitor top-3 accuracy.
Kept SparseCategoricalCrossentropy as the base loss function since it fits classification.
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 85%, Training loss 0.05, Validation loss 0.45

After: Training accuracy 93.5%, Validation accuracy 91.2%, Training loss 0.18, Validation loss 0.25

Adjusting the optimizer's learning rate and adding label smoothing helped reduce overfitting. Adding relevant metrics gives better insight into model performance beyond just accuracy.
Bonus Experiment
Try compiling the model with the RMSprop optimizer and compare validation accuracy and loss to the Adam optimizer results.
💡 Hint
RMSprop can sometimes help with faster convergence on image data. Keep the same learning rate and metrics for a fair comparison.