
Compiling models (optimizer, loss, metrics) in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Compiling models (optimizer, loss, metrics)
Problem: You have built a simple neural network model to classify handwritten digits using the MNIST dataset. The model is compiled with the Adam optimizer, sparse categorical crossentropy loss, and accuracy metric.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model shows signs of overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Adjust the model compilation parameters to reduce overfitting and improve validation accuracy to at least 90%, while keeping training accuracy below 95%.
You can only change the optimizer, loss function, and metrics in the model.compile() step.
Do not change the model architecture or dataset preprocessing.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset
mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize data
X_train, X_test = X_train / 255.0, X_test / 255.0

# Build model
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

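# Compile model with adjusted parameters.
# The built-in SparseCategoricalCrossentropy class does not take a
# label_smoothing argument (CategoricalCrossentropy does), so this wrapper
# one-hot encodes the integer labels inside the loss and applies smoothing
# there. Labels and preprocessing stay unchanged.
def smoothed_sparse_crossentropy(y_true, y_pred):
    y_true = tf.cast(tf.reshape(y_true, [-1]), tf.int32)
    y_true = tf.one_hot(y_true, depth=10)
    return tf.keras.losses.categorical_crossentropy(y_true, y_pred, label_smoothing=0.1)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=smoothed_sparse_crossentropy,
    metrics=[tf.keras.metrics.SparseCategoricalAccuracy(), tf.keras.metrics.SparseTopKCategoricalAccuracy(k=3)]
)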

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.2, verbose=0)

# Evaluate model
train_acc = history.history['sparse_categorical_accuracy'][-1] * 100
val_acc = history.history['val_sparse_categorical_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%')
print(f'Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.4f}')
print(f'Validation loss: {val_loss:.4f}')
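Set the Adam learning rate explicitly to 0.001. This is also the Keras default for Adam, so it documents the setting rather than changing training behavior.
Applied label smoothing of 0.1 to the crossentropy loss, which softens the training targets and acts as a regularizer against overconfident predictions.
Added SparseTopKCategoricalAccuracy(k=3) to monitor top-3 accuracy. Metrics are only reported; they do not affect the weight updates.
Kept a sparse (integer-label) crossentropy loss, since it matches the multi-class labels without changing preprocessing.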
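To make the effect of a 0.1 smoothing factor concrete, here is a small pure-Python sketch. It assumes the usual smoothing rule (each target is scaled by 1 - smoothing, then smoothing / num_classes is added to every class); the helper names `smooth_labels` and `cross_entropy` are illustrative and not part of TensorFlow.

```python
import math

def smooth_labels(label, num_classes, smoothing):
    """Return the smoothed target distribution for one integer label."""
    off_value = smoothing / num_classes
    on_value = 1.0 - smoothing + off_value
    return [on_value if i == label else off_value for i in range(num_classes)]

def cross_entropy(target, predicted):
    """Cross-entropy between a target distribution and predicted probabilities."""
    eps = 1e-7  # clip to avoid log(0)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

# With 10 classes and smoothing 0.1, the true class gets 0.91 and the rest 0.01 each.
target = smooth_labels(label=3, num_classes=10, smoothing=0.1)

# A near-certain prediction versus a more moderate one, both correct.
confident = [0.0001] * 10
confident[3] = 1.0 - 9 * 0.0001
moderate = [0.01] * 10
moderate[3] = 0.91

loss_confident = cross_entropy(target, confident)
loss_moderate = cross_entropy(target, moderate)
print(f"confident prediction loss: {loss_confident:.4f}")
print(f"moderate prediction loss:  {loss_moderate:.4f}")
```

Against the smoothed target, the moderate prediction scores a lower loss than the near-certain one, which is why smoothing discourages the overconfident fits associated with overfitting.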
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 85%, Training loss 0.05, Validation loss 0.45

After: Training accuracy 93.5%, Validation accuracy 91.2%, Training loss 0.18, Validation loss 0.25

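Label smoothing is the change that reduces overfitting here: the gap between training and validation accuracy narrows from 13 points to about 2. The learning rate of 0.001 matches Adam's default, and the added metrics do not influence training, but they give better insight into model performance beyond plain accuracy.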
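The size of the improvement can be checked with simple arithmetic on the Before and After figures quoted above. The sketch below only restates those reported numbers; `generalization_gap` is an illustrative helper name.

```python
# Metrics as reported in the Before/After lines above (accuracy in percent).
before = {"train_acc": 98.0, "val_acc": 85.0, "train_loss": 0.05, "val_loss": 0.45}
after = {"train_acc": 93.5, "val_acc": 91.2, "train_loss": 0.18, "val_loss": 0.25}

def generalization_gap(metrics):
    """Gap between training and validation performance; smaller means less overfitting."""
    return {
        "acc_gap": metrics["train_acc"] - metrics["val_acc"],
        "loss_gap": metrics["val_loss"] - metrics["train_loss"],
    }

for name, metrics in (("before", before), ("after", after)):
    gap = generalization_gap(metrics)
    print(f"{name}: accuracy gap {gap['acc_gap']:.1f} points, loss gap {gap['loss_gap']:.2f}")
# before: accuracy gap 13.0 points, loss gap 0.40
# after: accuracy gap 2.3 points, loss gap 0.07
```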
Bonus Experiment
Try compiling the model with the RMSprop optimizer and compare validation accuracy and loss to the Adam optimizer results.
💡 Hint
RMSprop can sometimes help with faster convergence on image data. Keep the same learning rate and metrics for a fair comparison.
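One way to set up this comparison is sketched below. To keep it quick and self-contained it trains on a small random stand-in array, which is an assumption of this sketch; replace `x` and `y` with the normalized MNIST arrays to run the actual experiment. `build_model` and `final_val_metrics` are illustrative helper names, and a fresh model is built per optimizer so neither run starts from the other's weights.

```python
import numpy as np
import tensorflow as tf

def build_model():
    """Same architecture as the experiment: Flatten -> Dense(128) -> Dense(10)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def final_val_metrics(optimizer, x, y, epochs=2):
    """Train a fresh model and return (val_accuracy, val_loss) from the last epoch."""
    model = build_model()
    model.compile(optimizer=optimizer,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x, y, epochs=epochs, batch_size=64,
                        validation_split=0.2, verbose=0)
    return history.history["val_accuracy"][-1], history.history["val_loss"][-1]

# Random stand-in data (assumption): swap in X_train / y_train from MNIST.
rng = np.random.default_rng(0)
x = rng.random((256, 28, 28)).astype("float32")
y = rng.integers(0, 10, size=256)

optimizers = {
    "adam": tf.keras.optimizers.Adam(learning_rate=0.001),
    "rmsprop": tf.keras.optimizers.RMSprop(learning_rate=0.001),
}
results = {}
for name, optimizer in optimizers.items():
    results[name] = final_val_metrics(optimizer, x, y)
    val_acc, val_loss = results[name]
    print(f"{name}: val_accuracy={val_acc:.3f}, val_loss={val_loss:.3f}")
```

On random labels both optimizers will hover near chance, so the printed numbers only become meaningful once real data is plugged in.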

Practice

1. What is the main purpose of the compile() method in a TensorFlow model?
easy
A. To set the optimizer, loss function, and metrics before training
B. To train the model on data
C. To save the model to disk
D. To make predictions on new data

Solution

  1. Step 1: Understand the role of compile()

    The compile() method prepares the model for training by specifying how it learns and how performance is measured.
  2. Step 2: Identify what compile() sets

    It sets the optimizer (how the model updates weights), the loss function (how error is calculated), and metrics (how performance is tracked).
  3. Final Answer:

    To set the optimizer, loss function, and metrics before training -> Option A
  4. Quick Check:

    Compile sets optimizer, loss, metrics = A [OK]
Hint: Compile sets learning rules and measurements before training [OK]
Common Mistakes:
  • Confusing compile with training or prediction
  • Thinking compile saves the model
  • Assuming compile runs the training process
2. Which of the following is the correct way to compile a TensorFlow model with Adam optimizer, categorical crossentropy loss, and accuracy metric?
easy
A. model.compile(optimizer='adam', loss='mse', metrics='accuracy')
B. model.compile(optimizer='sgd', loss='mse', metrics=['accuracy'])
C. model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
D. model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Solution

  1. Step 1: Check optimizer and loss names

    The Adam optimizer is specified as 'adam' and categorical crossentropy loss as 'categorical_crossentropy'.
  2. Step 2: Verify metrics format

    Metrics should be passed as a list, so ['accuracy'] is the expected form; a bare string is not accepted by every Keras version.
  3. Final Answer:

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy']) -> Option C
  4. Quick Check:

    Correct optimizer, loss, and metrics list = C [OK]
Hint: Use list for metrics and correct loss name [OK]
Common Mistakes:
  • Passing metrics as a string instead of list
  • Using wrong loss function for classification
  • Choosing wrong optimizer name
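A related point when choosing the loss name: 'categorical_crossentropy' expects one-hot labels, while 'sparse_categorical_crossentropy' (used in the experiment above) expects integer class indices. The pure-Python sketch below shows that, for hard labels, the two compute the same value; the function names mirror the Keras loss names but are plain re-implementations for illustration.

```python
import math

def categorical_crossentropy(one_hot, probs):
    """Cross-entropy for a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

def sparse_categorical_crossentropy(label, probs):
    """Cross-entropy for an integer class index."""
    return -math.log(probs[label])

probs = [0.1, 0.7, 0.2]    # predicted class probabilities
label = 1                  # integer label
one_hot = [0.0, 1.0, 0.0]  # the same label, one-hot encoded

dense_loss = categorical_crossentropy(one_hot, probs)
sparse_loss = sparse_categorical_crossentropy(label, probs)
print(f"categorical: {dense_loss:.4f}, sparse: {sparse_loss:.4f}")
```

The difference between the two is the label format they accept, not the quantity being minimized.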
3. Consider the code below:
model.compile(optimizer='sgd', loss='mse', metrics=['mae'])
history = model.fit(x_train, y_train, epochs=2)
print(history.history['mae'])

What will be printed?
medium
A. A single float value of mean absolute error after training
B. A list of mean squared error values for each epoch
C. An error because 'mae' is not a valid metric
D. A list of mean absolute error values for each epoch

Solution

  1. Step 1: Understand metrics in compile and fit

    The model is compiled with 'mae' (mean absolute error) as a metric, so it will track this during training.
  2. Step 2: Check what history.history['mae'] contains

    It stores a list of metric values for each epoch, so printing it shows a list of MAE values per epoch.
  3. Final Answer:

    A list of mean absolute error values for each epoch -> Option D
  4. Quick Check:

    Metrics list stores per-epoch values = D [OK]
Hint: history.history stores metric lists per epoch [OK]
Common Mistakes:
  • Expecting a single float instead of list
  • Confusing loss with metric values
  • Thinking 'mae' is invalid metric
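The shape of `history.history` can be mimicked with a toy training loop. This is a simplified pure-Python sketch, not how Keras is implemented: it fits y ≈ w * x with one full-batch gradient step per epoch, uses MSE as the loss, tracks MAE as a metric, and records both after each step. The function name `fit_line` is illustrative.

```python
def fit_line(xs, ys, epochs=2, learning_rate=0.1):
    """Fit y = w * x by gradient descent on MSE, recording loss and MAE per epoch."""
    w = 0.0
    history = {"loss": [], "mae": []}
    for _ in range(epochs):
        # Optimizer step: gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
        # Loss and metric are computed from the same errors but reported separately.
        errors = [w * x - y for x, y in zip(xs, ys)]
        history["loss"].append(sum(e * e for e in errors) / len(errors))
        history["mae"].append(sum(abs(e) for e in errors) / len(errors))
    return history

history = fit_line(xs=[1.0, 2.0, 3.0], ys=[2.0, 4.0, 6.0], epochs=2)
print(history["mae"])  # a list with one MAE value per epoch
```

As in the quiz answer, the metric entry is a list whose length equals the number of epochs, and it is stored separately from the loss.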
4. You wrote this code:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics='accuracy')

What is the problem?
medium
A. Metrics should be a list, not a string
B. Loss function name is incorrect
C. Optimizer name is invalid
D. Model must be compiled after training

Solution

  1. Step 1: Check metrics argument type

    Metrics must be passed as a list or tuple, e.g., ['accuracy'], not a string.
  2. Step 2: Confirm other arguments are correct

    Optimizer 'adam' and loss 'categorical_crossentropy' are valid names, so the issue is due to metrics format.
  3. Final Answer:

    Metrics should be a list, not a string -> Option A
  4. Quick Check:

    Metrics argument must be list = A [OK]
Hint: Always pass metrics as a list, even if one metric [OK]
Common Mistakes:
  • Passing metrics as a string
  • Misnaming loss or optimizer
  • Compiling after training instead of before
5. You want to compile a model for a binary classification task. Which combination of optimizer, loss, and metrics is the best choice?
hard
A. optimizer='rmsprop', loss='mse', metrics=['mae']
B. optimizer='adam', loss='binary_crossentropy', metrics=['accuracy']
C. optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy']
D. optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy']

Solution

  1. Step 1: Identify task type

    Binary classification means two classes, so the loss should be 'binary_crossentropy'.
  2. Step 2: Choose suitable optimizer and metrics

    Adam optimizer is widely used and effective; accuracy is a good metric for classification.
  3. Step 3: Check other options

    Options C and D use categorical losses meant for multi-class tasks, and A uses a regression loss and metric, so they are less suitable.
  4. Final Answer:

    optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'] -> Option B
  5. Quick Check:

    Binary task needs binary_crossentropy loss = B [OK]
Hint: Binary classification uses binary_crossentropy loss [OK]
Common Mistakes:
  • Using categorical loss for binary tasks
  • Choosing regression loss for classification
  • Ignoring metric suitability
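For intuition about what this loss and metric measure, here is a pure-Python sketch of binary cross-entropy and thresholded accuracy on four hand-picked predictions. The 0.5 threshold and the sample values are assumptions chosen for illustration, and the functions are plain re-implementations rather than the Keras ones.

```python
import math

def binary_crossentropy(y_true, y_prob):
    """Mean binary cross-entropy over a batch of (label, probability) pairs."""
    eps = 1e-7  # clip probabilities to avoid log(0)
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(1 for y, p in zip(y_true, y_prob) if (p > threshold) == (y == 1))
    return hits / len(y_true)

y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]  # the third sample is misclassified at threshold 0.5

loss = binary_crossentropy(y_true, y_prob)
accuracy = binary_accuracy(y_true, y_prob)
print(f"loss={loss:.4f}, accuracy={accuracy:.2f}")
```

The loss is driven mostly by the one misclassified sample, while accuracy only counts it as a single miss, which is why the two are tracked together.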