TensorFlow · ML · ~20 mins

Loss functions (MSE, cross-entropy) in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Loss functions (MSE, cross-entropy)

Problem: You are training a neural network to classify images into two categories. The current model uses Mean Squared Error (MSE) as the loss function.

Current Metrics: Training loss: 0.15, Training accuracy: 85%, Validation loss: 0.20, Validation accuracy: 80%

Issue: The model's validation accuracy is lower than expected for a classification task. With a sigmoid output, MSE produces very small gradients when the model is confidently wrong, which can slow learning and lead to less accurate predictions.
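To make the slow-learning claim concrete, here is a minimal sketch in plain Python that compares the gradient of each loss with respect to the pre-sigmoid value (the logit) for a single example. The helper names are illustrative and not part of any TensorFlow API; the derivatives assume a single sigmoid output unit.

```python
import math

def sigmoid(z):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def mse_grad_wrt_logit(z, y):
    """d/dz of (sigmoid(z) - y)^2. The factor p * (1 - p) shrinks toward
    zero whenever the sigmoid saturates."""
    p = sigmoid(z)
    return 2.0 * (p - y) * p * (1.0 - p)

def bce_grad_wrt_logit(z, y):
    """d/dz of binary cross-entropy with a sigmoid output.
    The saturation factor cancels, leaving just p - y."""
    return sigmoid(z) - y

# A confidently wrong prediction: the true label is 1, the logit is very negative.
z, y = -5.0, 1.0
g_mse = mse_grad_wrt_logit(z, y)
g_bce = bce_grad_wrt_logit(z, y)
print(f"MSE gradient: {g_mse:.4f}")
print(f"BCE gradient: {g_bce:.4f}")
```

For this example the MSE gradient is roughly 75 times smaller in magnitude than the cross-entropy gradient, so a model trained with MSE corrects this kind of mistake much more slowly.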

Your Task

Improve validation accuracy by switching the loss function from MSE to binary cross-entropy, which is better suited for classification problems.

Keep the model architecture the same.

Only change the loss function and retrain the model.

Use TensorFlow and Keras APIs.
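One way to respect these constraints is to build the network in a single function and pass the loss in as the only parameter. The sketch below is an assumption about how such a comparison could be organized: `build_model` is a hypothetical helper, and the 64-unit hidden layer is borrowed from the solution further down rather than from the original (unspecified) model.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(loss, seed=0):
    """Build the same small binary classifier every time; only `loss` varies.
    `build_model` and `seed` are illustrative names, not part of Keras."""
    # Reseeding is intended to give both runs the same initial weights.
    tf.keras.utils.set_random_seed(seed)
    model = models.Sequential([
        layers.Input(shape=(28 * 28,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss=loss, metrics=["accuracy"])
    return model

baseline = build_model("mse")                    # the current setup
candidate = build_model("binary_crossentropy")   # the proposed change

# Both models have identical architecture and parameter counts.
print(baseline.count_params(), candidate.count_params())
```

Training both models with the same data, epochs, and batch size then isolates the effect of the loss function.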

Solution

import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST (10 digit classes); the test split serves as validation data here
(X_train, y_train), (X_val, y_val) = tf.keras.datasets.mnist.load_data()

# Preprocess data: keep only digits 0 and 1 for binary classification
train_filter = (y_train == 0) | (y_train == 1)
val_filter = (y_val == 0) | (y_val == 1)
X_train, y_train = X_train[train_filter], y_train[train_filter]
X_val, y_val = X_val[val_filter], y_val[val_filter]

# Normalize pixel values
X_train = X_train.astype('float32') / 255.0
X_val = X_val.astype('float32') / 255.0

# Flatten images
X_train = X_train.reshape(-1, 28*28)
X_val = X_val.reshape(-1, 28*28)

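# Build simple model (an explicit Input layer avoids the input_shape
# deprecation warning in recent Keras versions)
model = models.Sequential([
    layers.Input(shape=(28*28,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])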

# Compile model with binary cross-entropy loss
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_val, y_val))

Changed loss function from Mean Squared Error (MSE) to binary cross-entropy.

Kept the model architecture unchanged.

Used sigmoid activation in the output layer for binary classification.

Results Interpretation

Before: Training accuracy 85%, Validation accuracy 80%, Loss function: MSE

After: Training accuracy ~98%, Validation accuracy ~97%, Loss function: Binary Cross-Entropy

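Matching the loss function to the task matters. With a sigmoid output, binary cross-entropy penalizes confident wrong predictions far more heavily than MSE and keeps gradients from shrinking as the sigmoid saturates, which typically leads to faster learning and higher accuracy on classification problems.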
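As a rough numeric illustration of that difference, the sketch below computes both per-example losses by hand for a positive example (label 1) at several predicted probabilities. The function names and the clipping constant `eps` are illustrative choices; clipping is a common safeguard against `log(0)`.

```python
import math

def binary_cross_entropy(y, p, eps=1e-7):
    """Per-example binary cross-entropy: -[y*log(p) + (1-y)*log(1-p)]."""
    p = min(max(p, eps), 1.0 - eps)  # keep p away from exactly 0 or 1
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def squared_error(y, p):
    """Per-example squared error; never exceeds 1 when p is a probability."""
    return (y - p) ** 2

y = 1.0  # true label
for p in (0.9, 0.5, 0.1, 0.01):
    print(f"p={p:<5} squared_error={squared_error(y, p):.4f} "
          f"bce={binary_cross_entropy(y, p):.4f}")
```

Squared error stays below 1 however wrong the prediction is, while cross-entropy keeps growing as the predicted probability for the true class approaches 0.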

Bonus Experiment

Try using categorical cross-entropy loss with a model that classifies digits 0 to 9 (10 classes).

💡 Hint

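Change the output layer to have 10 units with softmax activation and use 'sparse_categorical_crossentropy' as the loss. The sparse variant accepts the integer labels 0 to 9 directly, so no one-hot encoding is needed. Skip the step that filters the data down to digits 0 and 1.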
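A minimal sketch of the model change described in the hint, assuming the same 64-unit hidden layer as the binary solution. `build_digit_classifier` is a hypothetical helper name, and the random input only checks output shapes; it is not real MNIST data.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_digit_classifier():
    """10-class variant: softmax output paired with sparse categorical cross-entropy."""
    model = models.Sequential([
        layers.Input(shape=(28 * 28,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),  # one probability per digit
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # expects integer labels 0..9
                  metrics=["accuracy"])
    return model

model = build_digit_classifier()

# Placeholder input (random values, not real images) to inspect the output.
dummy = np.random.rand(4, 28 * 28).astype("float32")
probs = model.predict(dummy, verbose=0)
print(probs.shape)        # one row per input, one column per class
print(probs.sum(axis=1))  # each row of softmax outputs sums to about 1
```

To train it, the full `y_train` and `y_val` label arrays from `mnist.load_data()` can be passed to `model.fit` unchanged.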

Practice

1. (easy) Which loss function is best suited for predicting continuous numbers in TensorFlow?

A. Mean Squared Error (MSE)

B. Categorical Cross-Entropy

C. Binary Cross-Entropy

D. Hinge Loss

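Solution

Step 1: Understand the type of prediction. Predicting continuous numbers is a regression task.

Step 2: Match loss function to prediction type. MSE measures the average squared difference between predicted and true values, which suits regression. Cross-entropy and hinge losses are designed for classification.

Final Answer: A. Mean Squared Error (MSE)

Quick Check: Continuous output means regression, and regression pairs with MSE.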
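For reference, MSE on a regression target can be computed by hand in two steps: square each error, then average. The values below are made-up illustration data, and `mean_squared_error` is a plain-Python helper, not the Keras function.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)

# Illustrative values only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Squared errors are 0.25, 0.25, 0.0 and 1.0, so the mean is 1.5 / 4.
mse = mean_squared_error(y_true, y_pred)
print(mse)  # 0.375
```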
