When we want to see how well a model guesses categories, the confusion matrix helps us understand the details. It shows how many times the model got things right or wrong for each category. This helps us pick the right metric like accuracy, precision, or recall depending on what matters most for our problem.
Confusion matrix visualization in TensorFlow - Model Metrics & Evaluation
         | Predicted 0 | Predicted 1 |
Actual 0 | 50          | 10          |
Actual 1 | 5           | 35          |
Where:
- 50 = True Negative (TN)
- 10 = False Positive (FP)
- 5 = False Negative (FN)
- 35 = True Positive (TP)
Total samples = 50 + 10 + 5 + 35 = 100
This table shows how many times the model guessed each class correctly or incorrectly.
Precision tells us how many of the items the model said were positive actually are positive. For example, in spam detection, high precision means few good emails are wrongly marked as spam.
Recall tells us how many of the actual positive items the model found. For example, in cancer detection, high recall means the model finds most cancer cases, even if it sometimes makes mistakes.
Improving one often lowers the other, so we choose based on what is more important: avoiding false alarms or missing real cases.
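The trade-off can be seen by moving the decision threshold of a classifier. The sketch below is a minimal illustration in plain Python; the scores, labels, and the helper name `precision_recall` are made up for this example and do not come from any real model.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when scores >= threshold count as positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical model scores and true labels (assumed data, for illustration only).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

for threshold in (0.85, 0.50, 0.25):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

With this toy data, the strictest threshold gives perfect precision but finds only two of the five positives, while the loosest threshold finds every positive at the cost of two false alarms.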
For the confusion matrix above:
- Precision = TP / (TP + FP) = 35 / (35 + 10) = 0.78 (78%)
- Recall = TP / (TP + FN) = 35 / (35 + 5) = 0.88 (88%)
- Accuracy = (TP + TN) / Total = (35 + 50) / 100 = 0.85 (85%)
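The arithmetic above can be checked with a few lines of plain Python. This is only a sketch that plugs in the four counts from the example matrix; the variable names are chosen for this illustration.

```python
# Counts taken from the example confusion matrix above.
tn, fp, fn, tp = 50, 10, 5, 35
total = tn + fp + fn + tp

precision = tp / (tp + fp)    # share of predicted positives that are truly positive
recall = tp / (tp + fn)       # share of actual positives that the model found
accuracy = (tp + tn) / total  # share of all predictions that are correct

print(f"precision={precision:.3f} recall={recall:.3f} accuracy={accuracy:.3f}")
```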
Good: Precision and recall both above 80% usually mean the model finds most positives without raising many false alarms.
Bad: Precision or recall below 50% means the model either raises many false alarms or misses most positives.
- Accuracy paradox: High accuracy can be misleading if classes are unbalanced. For example, if 95% of data is negative, a model that always guesses negative has 95% accuracy but is useless.
- Data leakage: When the model accidentally learns from future or test data, metrics look better but the model fails in real use.
- Overfitting: Very high training metrics but poor test metrics mean the model memorizes training data and won't generalize.
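The accuracy paradox from the first pitfall can be reproduced with a small sketch. The data below is synthetic and assumes exactly the 95% negative split mentioned above, with a "model" that always predicts the negative class.

```python
# Synthetic, imbalanced data: 95 negatives and 5 positives (assumed for illustration).
labels = [0] * 95 + [1] * 5
# A trivial "model" that always predicts the negative class.
predictions = [0] * len(labels)

correct = sum(1 for p, y in zip(predictions, labels) if p == y)
tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
fn = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)

accuracy = correct / len(labels)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")
```

Accuracy looks high here even though the model never finds a single positive case, which is why recall (or the full confusion matrix) is worth checking on imbalanced data.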
Your model has 98% accuracy but only 12% recall on fraud cases. Is it good for production?
Answer: No. The model misses 88% of fraud cases (low recall), which is dangerous. Even with high accuracy, it fails to catch most frauds, so it is not good for production.
Practice
Solution
Step 1: Understand the purpose of a confusion matrix
A confusion matrix is a table used to describe the performance of a classification model by showing correct and incorrect predictions for each class.
Step 2: Match the description to the options
The description 'How many times each class was predicted correctly or wrongly' matches the purpose of a confusion matrix.
Final Answer:
How many times each class was predicted correctly or wrongly -> Option D
Quick Check:
Confusion matrix = correct and wrong predictions [OK]
- Confusing confusion matrix with training speed
- Thinking it shows model architecture details
- Assuming it shows dataset size
Solution
Step 1: Identify TensorFlow functions related to confusion matrix
The function to create a confusion matrix is specifically designed to compare true and predicted labels.
Step 2: Match the function to the options
tf.math.confusion_matrix is the correct TensorFlow function for this purpose, while others relate to layers, datasets, or image processing.
Final Answer:
tf.math.confusion_matrix -> Option C
Quick Check:
Confusion matrix function = tf.math.confusion_matrix [OK]
- Choosing layer or dataset functions instead
- Confusing with image processing functions
- Using non-existent TensorFlow functions
import tensorflow as tf
true_labels = [0, 1, 2, 2, 0]
pred_labels = [0, 2, 2, 2, 0]
cm = tf.math.confusion_matrix(true_labels, pred_labels)
print(cm.numpy())
Solution
Step 1: Count true vs predicted labels
For class 0: true labels are at positions 0 and 4, predicted also 0 both times -> 2 correct.
For class 1: true label at position 1, predicted is 2 -> 0 correct, 1 predicted as 2.
For class 2: true labels at positions 2 and 3, predicted both 2 -> 2 correct.
Step 2: Build confusion matrix rows
Row 0 (true 0): predicted 0 twice -> [2,0,0]
Row 1 (true 1): predicted 2 once -> [0,0,1]
Row 2 (true 2): predicted 2 twice -> [0,0,2]
Final Answer:
[[2 0 0] [0 0 1] [0 0 2]] -> Option A
Quick Check:
Count true vs predicted labels = [[2 0 0] [0 0 1] [0 0 2]] [OK]
- Mixing up true and predicted label order
- Counting predicted labels as rows
- Miscounting class occurrences
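The row-by-row counting in the solution above can be reproduced without TensorFlow. The helper below is a plain-Python sketch written for this illustration (it is not TensorFlow's implementation); it follows the same convention of true labels as rows and predicted labels as columns.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        matrix[t][p] += 1
    return matrix


true_labels = [0, 1, 2, 2, 0]
pred_labels = [0, 2, 2, 2, 0]

cm = confusion_matrix(true_labels, pred_labels, num_classes=3)
for row in cm:
    print(row)
```

Swapping the two label lists transposes the matrix, which is the first mistake listed above.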
import tensorflow as tf
true_labels = [0, 1, 1, 0]
pred_labels = [0, 1, 0, 0]
cm = tf.math.confusion_matrix(true_labels, pred_labels, num_classes=1)
print(cm.numpy())
Solution
Step 1: Check the number of classes in labels
True and predicted labels only contain 0 and 1, so there are 2 classes total.
Step 2: Verify num_classes argument
Setting num_classes=1 is incorrect because the labels include 1, which is not in [0, 1), so TensorFlow raises an error reporting that the labels are out of bound (in eager mode this is typically tf.errors.InvalidArgumentError).
Final Answer:
num_classes should be 2, not 1 -> Option B
Quick Check:
num_classes must match actual classes = 2 [OK]
- Using wrong num_classes value
- Thinking lists are invalid inputs
- Misunderstanding print method for tensors
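The range rule behind this answer can be sketched in plain Python. The helper `check_labels_in_range` is hypothetical and written only for this illustration; it mimics the idea that every label must lie in [0, num_classes), and the exception type it raises is a choice made here, not necessarily the one TensorFlow uses.

```python
def check_labels_in_range(labels, num_classes):
    """Raise ValueError if any label falls outside [0, num_classes)."""
    bad = [label for label in labels if not 0 <= label < num_classes]
    if bad:
        raise ValueError(
            f"labels out of range for num_classes={num_classes}: {bad}"
        )


true_labels = [0, 1, 1, 0]
pred_labels = [0, 1, 0, 0]

# Inferring the class count from the data avoids hard-coding a wrong value.
num_classes = max(true_labels + pred_labels) + 1
check_labels_in_range(true_labels + pred_labels, num_classes)  # passes silently

try:
    check_labels_in_range(true_labels + pred_labels, num_classes=1)
except ValueError as err:
    print("rejected:", err)
```

Deriving the class count from the labels works when every class appears in the data; if some class may be absent from a batch, passing the known class count explicitly is the safer choice.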
Solution
Step 1: Generate confusion matrix using TensorFlow
tf.math.confusion_matrix(true, pred) correctly creates the confusion matrix tensor.
Step 2: Visualize matrix using Matplotlib heatmap
plt.imshow with cmap='Blues' displays the matrix as a heatmap, plt.colorbar adds a color scale, and plt.show() renders the plot.
Final Answer:
Code snippet B correctly creates and displays the heatmap -> Option A
Quick Check:
Use tf.math.confusion_matrix + plt.imshow + plt.colorbar [OK]
- Using tf.keras.metrics.ConfusionMatrix (does not exist)
- Plotting confusion matrix with plt.plot or plt.bar
- Forgetting to add colorbar for scale
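Since the quiz snippets themselves are not shown, here is one possible way to draw the heatmap described above. It is a sketch, not the quiz's own code: the helper name `plot_confusion_matrix`, the axis labels, and the per-cell text are assumptions added for readability. The matrix is hard-coded from the earlier binary example; in practice it could come from `tf.math.confusion_matrix(true, pred).numpy()`.

```python
import matplotlib.pyplot as plt


def plot_confusion_matrix(cm, class_names=None):
    """Draw a square confusion matrix as a heatmap and return (fig, ax)."""
    n = len(cm)
    if class_names is None:
        class_names = [str(i) for i in range(n)]

    fig, ax = plt.subplots()
    image = ax.imshow(cm, cmap="Blues")  # darker cells hold larger counts
    fig.colorbar(image, ax=ax)           # color scale for the counts

    ax.set_xticks(list(range(n)))
    ax.set_yticks(list(range(n)))
    ax.set_xticklabels(class_names)
    ax.set_yticklabels(class_names)
    ax.set_xlabel("Predicted")
    ax.set_ylabel("Actual")

    # Write the raw count inside each cell.
    for row in range(n):
        for col in range(n):
            ax.text(col, row, str(cm[row][col]), ha="center", va="center")
    return fig, ax


# Matrix from the binary example earlier: rows are actual, columns are predicted.
cm = [[50, 10], [5, 35]]
fig, ax = plot_confusion_matrix(cm)
# Call plt.show() at this point to display the figure in an interactive session.
```

Returning the figure and axes instead of calling `plt.show()` inside the helper keeps it usable in scripts, notebooks, and tests alike.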
