Loss functions (MSE, cross-entropy) in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Loss functions (MSE, cross-entropy)
Problem: You are training a neural network to classify images into two categories. The current model uses Mean Squared Error (MSE) as the loss function.
Current Metrics: Training loss: 0.15, Training accuracy: 85%, Validation loss: 0.20, Validation accuracy: 80%
Issue: The model's validation accuracy is lower than expected for a classification task. Using MSE loss for classification can cause slower learning and less accurate predictions.
Your Task
Improve validation accuracy by switching the loss function from MSE to binary cross-entropy, which is better suited for classification problems.
Keep the model architecture the same.
Only change the loss function and retrain the model.
Use TensorFlow and Keras APIs.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST (10 digit classes); it is reduced to a binary task below.
# The second tuple is the MNIST test split, used here as validation data.
(X_train, y_train), (X_val, y_val) = tf.keras.datasets.mnist.load_data()

# Preprocess data: keep only digits 0 and 1 for binary classification
train_filter = (y_train == 0) | (y_train == 1)
val_filter = (y_val == 0) | (y_val == 1)
X_train, y_train = X_train[train_filter], y_train[train_filter]
X_val, y_val = X_val[val_filter], y_val[val_filter]

# Normalize pixel values
X_train = X_train.astype('float32') / 255.0
X_val = X_val.astype('float32') / 255.0

# Flatten images
X_train = X_train.reshape(-1, 28*28)
X_val = X_val.reshape(-1, 28*28)

# Build simple model
model = models.Sequential([
    layers.Dense(64, activation='relu', input_shape=(28*28,)),
    layers.Dense(1, activation='sigmoid')
])

# Compile model with binary cross-entropy loss
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))
Changed loss function from Mean Squared Error (MSE) to binary cross-entropy.
Kept the model architecture unchanged.
Used sigmoid activation in the output layer for binary classification.
Results Interpretation

Before: Training accuracy 85%, Validation accuracy 80%, Loss function: MSE

After: Training accuracy ~98%, Validation accuracy ~97%, Loss function: Binary Cross-Entropy

Matching the loss function to the task matters. For classification, binary cross-entropy is usually a better fit than MSE because it penalizes confident wrong predictions much more heavily, which tends to give faster learning and higher accuracy.
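One way to see where the faster learning comes from: with a sigmoid output p = sigmoid(z), the gradient of binary cross-entropy with respect to the logit z is p - y, while the gradient of squared error is 2(p - y) * p * (1 - p), which shrinks toward zero once the sigmoid saturates. The sketch below is plain Python rather than the lesson's TensorFlow code, and the helper names are illustrative assumptions. It compares the two gradients for a single, confidently wrong prediction.

```python
import math

def sigmoid(z):
    """Logistic function used by the model's output layer."""
    return 1.0 / (1.0 + math.exp(-z))

def grad_mse_wrt_logit(z, y):
    """Derivative of (sigmoid(z) - y)**2 with respect to z."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def grad_bce_wrt_logit(z, y):
    """Derivative of -[y*log(p) + (1-y)*log(1-p)] with respect to z, p = sigmoid(z)."""
    p = sigmoid(z)
    return p - y

# True label is 1, but the model is very sure the answer is 0.
z, y = -6.0, 1.0
g_mse = grad_mse_wrt_logit(z, y)
g_bce = grad_bce_wrt_logit(z, y)

print("prediction:", sigmoid(z))
print("MSE gradient:", g_mse)
print("BCE gradient:", g_bce)
```

In this example the cross-entropy gradient stays close to -1 while the MSE gradient is hundreds of times smaller, so a gradient step under MSE barely moves the weights away from the wrong answer.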
Bonus Experiment
Try a multi-class version: use a categorical cross-entropy loss with a model that classifies digits 0 to 9 (10 classes).
💡 Hint
Change the output layer to 10 units with softmax activation and use 'sparse_categorical_crossentropy' as the loss, because the MNIST labels are integers (0 to 9) rather than one-hot vectors.
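A minimal sketch of the model the hint describes, assuming the same flattened 28*28 input and 64-unit hidden layer as the solution above. The function name `build_digit_classifier` and the random placeholder batch are illustrative assumptions; no real MNIST data is loaded here.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_digit_classifier(num_classes=10):
    """Hypothetical helper: same hidden layer as the lesson, 10-way softmax output."""
    model = models.Sequential([
        layers.Input(shape=(28 * 28,)),
        layers.Dense(64, activation='relu'),
        layers.Dense(num_classes, activation='softmax'),
    ])
    # The 'sparse' variant expects integer class labels such as 0..9.
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model

model = build_digit_classifier()

# Placeholder batch standing in for flattened, normalized images.
rng = np.random.default_rng(0)
X_demo = rng.random((8, 28 * 28), dtype=np.float32)

probs = model.predict(X_demo, verbose=0)
print(probs.shape)        # one row of 10 class probabilities per image
print(probs.sum(axis=1))  # each row sums to approximately 1
```

To train on the full digit set, skip the 0/1 filtering step and pass the integer labels to `model.fit` unchanged. With one-hot encoded labels, the matching loss would be 'categorical_crossentropy' instead.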

Practice

1. Which loss function is best suited for predicting continuous numbers in TensorFlow?
easy
A. Mean Squared Error (MSE)
B. Categorical Cross-Entropy
C. Binary Cross-Entropy
D. Hinge Loss

Solution

  1. Step 1: Understand the type of prediction

    Continuous number prediction means the output is a real number, not categories.
  2. Step 2: Match loss function to prediction type

    MSE calculates the average squared difference between predicted and true numbers, ideal for continuous values.
  3. Final Answer:

    Mean Squared Error (MSE) -> Option A
  4. Quick Check:

    Continuous output = MSE [OK]
Hint: Use MSE for numbers, cross-entropy for categories [OK]
Common Mistakes:
  • Using cross-entropy for number prediction
  • Confusing binary and categorical cross-entropy
  • Choosing hinge loss for regression
2. Which of the following is the correct way to use Mean Squared Error loss in TensorFlow?
easy
A. tf.keras.losses.BinaryCrossentropy()
B. tf.losses.CrossEntropy()
C. tf.keras.losses.MeanSquaredError()
D. tf.losses.MSE()

Solution

  1. Step 1: Recall TensorFlow loss function syntax

    TensorFlow uses tf.keras.losses.MeanSquaredError() for MSE loss.
  2. Step 2: Check options for correct function name and module

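    tf.keras.losses.MeanSquaredError() creates an MSE loss object. BinaryCrossentropy is a different loss, tf.losses.CrossEntropy is not an existing name, and tf.losses.MSE is a function that expects y_true and y_pred, so calling it with no arguments fails.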
  3. Final Answer:

    tf.keras.losses.MeanSquaredError() -> Option C
  4. Quick Check:

    Correct MSE syntax = tf.keras.losses.MeanSquaredError() [OK]
Hint: Use tf.keras.losses for standard loss functions [OK]
Common Mistakes:
  • Calling the functional form tf.losses.MSE() without y_true and y_pred
  • Wrong function names like CrossEntropy for MSE
  • Missing parentheses when creating loss object
3. What will be the output loss value when using Mean Squared Error loss in TensorFlow for predictions [2.0, 3.0] and true values [1.0, 5.0]?
medium
A. 1.5
B. 3.0
C. 4.0
D. 2.5

Solution

  1. Step 1: Calculate squared errors for each prediction

    (2.0 - 1.0)^2 = 1.0, (3.0 - 5.0)^2 = 4.0
  2. Step 2: Compute mean of squared errors

    (1.0 + 4.0) / 2 = 2.5
  3. Step 3: Verify options

    TensorFlow's MSE returns the mean of the squared errors, not their sum, so the loss is 2.5, which matches Option D.
  4. Final Answer:

    2.5 -> Option D
  5. Quick Check:

    MSE = mean squared error = 2.5 [OK]
Hint: Square errors, then average them for MSE [OK]
Common Mistakes:
  • Summing errors without averaging
  • Taking absolute difference instead of squared
  • Mixing up predicted and true values
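The arithmetic in question 3 can be checked by hand. The plain-Python sketch below (variable names are illustrative) computes the mean squared error for the same values, along with two of the mistaken results from the list above for comparison.

```python
y_true = [1.0, 5.0]
y_pred = [2.0, 3.0]

# Correct: square each error, then average.
squared_errors = [(p - t) ** 2 for p, t in zip(y_pred, y_true)]
mse = sum(squared_errors) / len(squared_errors)

# Mistake 1: summing the squared errors without averaging.
sum_of_squares = sum(squared_errors)

# Mistake 2: averaging absolute differences (this is MAE, not MSE).
mae = sum(abs(p - t) for p, t in zip(y_pred, y_true)) / len(y_true)

print(mse)             # 2.5
print(sum_of_squares)  # 5.0
print(mae)             # 1.5
```

The absolute-difference mistake produces 1.5, which is Option A.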
4. Identify the error in this TensorFlow code snippet using categorical cross-entropy loss:
model.compile(optimizer='adam', loss=tf.keras.losses.CategoricalCrossentropy, metrics=['accuracy'])
medium
A. Missing parentheses after CategoricalCrossentropy
B. Wrong optimizer name
C. Metrics should be 'loss' not 'accuracy'
D. Loss function should be a string, not an object

Solution

  1. Step 1: Check loss function usage in compile

    A loss class must be instantiated before it is passed to compile, so parentheses are needed.
  2. Step 2: Identify missing parentheses

    tf.keras.losses.CategoricalCrossentropy is a class; missing () means passing the class, not an instance.
  3. Final Answer:

    Missing parentheses after CategoricalCrossentropy -> Option A
  4. Quick Check:

    Loss function needs () to create instance [OK]
Hint: Always add () when passing loss function classes [OK]
Common Mistakes:
  • Forgetting parentheses on loss functions
  • Confusing optimizer names
  • Using wrong metric names
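The difference between the class and an instance in question 4 can be made concrete. This is a small sketch, assuming TensorFlow 2.x; the example tensors are made up for illustration. How `compile()` reacts to a bare class may vary between Keras versions, so the sketch only demonstrates the class/instance distinction and does not claim a particular error message.

```python
import inspect
import tensorflow as tf

loss_class = tf.keras.losses.CategoricalCrossentropy   # the class itself (no parentheses)
loss_fn = tf.keras.losses.CategoricalCrossentropy()    # an instance, ready to use

print(inspect.isclass(loss_class))  # True
print(inspect.isclass(loss_fn))     # False

# An instance can be called directly on one-hot labels and predicted probabilities.
y_true = tf.constant([[0.0, 1.0, 0.0]])
y_pred = tf.constant([[0.1, 0.8, 0.1]])
loss_value = float(loss_fn(y_true, y_pred))
print(loss_value)  # approximately -ln(0.8)
```

Passing `loss_fn` (or the string 'categorical_crossentropy') to `model.compile` avoids the ambiguity entirely.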
5. You have a multi-class classification problem with 4 classes. Which loss function and output layer activation should you use in TensorFlow for best results?
hard
A. Use Mean Squared Error loss with sigmoid activation
B. Use Categorical Cross-Entropy loss with softmax activation
C. Use Binary Cross-Entropy loss with softmax activation
D. Use Hinge loss with linear activation

Solution

  1. Step 1: Identify problem type and output requirements

    Multi-class classification with 4 classes requires probabilities summing to 1.
  2. Step 2: Match loss and activation functions

    Softmax activation outputs a probability for each class; categorical cross-entropy compares that distribution with the true (one-hot) label.
  3. Final Answer:

    Use Categorical Cross-Entropy loss with softmax activation -> Option B
  4. Quick Check:

    Multi-class = softmax + categorical cross-entropy [OK]
Hint: Softmax + categorical cross-entropy for multi-class [OK]
Common Mistakes:
  • Using MSE for classification
  • Using sigmoid for multi-class output
  • Using binary cross-entropy for multi-class
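To make the softmax plus categorical cross-entropy pairing from question 5 concrete, here is a plain-Python sketch for one example with 4 classes. The logits and helper names are made up for illustration; this is not how Keras implements the loss internally.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

logits = [2.0, 1.0, 0.1, -1.0]   # hypothetical outputs of the final layer
one_hot = [1.0, 0.0, 0.0, 0.0]   # true class is index 0

probs = softmax(logits)
loss = categorical_cross_entropy(one_hot, probs)

print(probs)
print(sum(probs))
print(loss)
```

Because the label is one-hot, only the probability assigned to the true class contributes to the loss, so the loss falls toward zero as that probability approaches 1.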