
How to Use a Confusion Matrix in sklearn with Python

Use confusion_matrix from sklearn.metrics by passing true labels and predicted labels as arguments. It returns a matrix showing counts of correct and incorrect predictions for each class.

Syntax

The confusion_matrix function computes a confusion matrix, which is used to evaluate the accuracy of a classifier. It takes the following parameters:

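  • y_true: The true (ground-truth) labels of your data.
  • y_pred: The labels predicted by your model, in the same order as y_true.
  • labels (optional): List of labels that sets the order of the rows and columns, or selects a subset of classes. If None, the labels that appear in y_true or y_pred are used in sorted order.
  • normalize (optional): None for raw counts, 'true' to normalize over rows (true classes), 'pred' to normalize over columns (predicted classes), or 'all' to normalize over the whole matrix.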
python
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred, labels=None, normalize=None)
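
The labels and normalize parameters described above can be sketched on a small example. The animal labels here are made up for illustration and are not from any real dataset; the sketch only assumes a scikit-learn version that supports the normalize argument.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical string labels, chosen only for this illustration
y_true = ["cat", "dog", "cat", "cat", "dog"]
y_pred = ["cat", "cat", "cat", "dog", "dog"]

# labels fixes the row/column order: index 0 is "dog", index 1 is "cat"
order = ["dog", "cat"]
cm_counts = confusion_matrix(y_true, y_pred, labels=order)
print(cm_counts)
# [[1 1]
#  [1 2]]

# normalize='true' divides each row by the number of true samples of that class,
# so every row sums to 1
cm_rates = confusion_matrix(y_true, y_pred, labels=order, normalize="true")
print(np.round(cm_rates, 2))
```

With normalize='true', each row reads as the share of that true class assigned to each predicted class, which makes rows comparable even when the classes have different sizes.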

Example

This example shows how to create a confusion matrix for a simple classification problem with true and predicted labels.

python
from sklearn.metrics import confusion_matrix

# True labels
y_true = [0, 1, 2, 2, 0, 1]
# Predicted labels
y_pred = [0, 2, 2, 2, 0, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
Output
[[2 0 0]
 [1 0 1]
 [0 0 2]]
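
To read this result, recall that rows are true classes and columns are predicted classes. The following is a minimal sketch of pulling a few summaries out of the same matrix with NumPy; the variable names are illustrative only.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 2, 2, 0, 1]
y_pred = [0, 2, 2, 2, 0, 0]
cm = confusion_matrix(y_true, y_pred)

# Diagonal entries count samples whose predicted class equals the true class
correct_per_class = np.diag(cm)
print(correct_per_class)      # [2 0 2]

# Row sums give the number of true samples in each class
true_per_class = cm.sum(axis=1)
print(true_per_class)         # [2 2 2]

# Column sums give how often each class was predicted
predicted_per_class = cm.sum(axis=0)
print(predicted_per_class)    # [3 0 3]

# Overall accuracy is the diagonal total divided by the number of samples (4 of 6 here)
accuracy = correct_per_class.sum() / cm.sum()
```

Row 1 is [1 0 1]: both samples of class 1 were misclassified, one as class 0 and one as class 2, and the zero column sum shows that class 1 was never predicted at all.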

Common Pitfalls

Common mistakes when using confusion_matrix include:

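  • Mixing up y_true and y_pred, which transposes the matrix and leads to incorrect interpretation.
  • Not specifying labels when a class may be missing from the data or when you need a particular class order. By default only the labels found in y_true or y_pred are used, in sorted order, so the matrix can be smaller or ordered differently than expected.
  • Comparing raw counts between datasets with different class distributions instead of normalizing the matrix.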
python
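from sklearn.metrics import confusion_matrix

# Same labels as in the example above
y_true = [0, 1, 2, 2, 0, 1]
y_pred = [0, 2, 2, 2, 0, 0]

# Wrong: swapped arguments produce the transposed matrix
cm_wrong = confusion_matrix(y_pred, y_true)
print('Wrong matrix:\n', cm_wrong)

# Right: true labels first, then predicted labels
cm_right = confusion_matrix(y_true, y_pred)
print('Right matrix:\n', cm_right)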
Output
Wrong matrix:
 [[2 1 0]
 [0 0 0]
 [0 1 2]]
Right matrix:
 [[2 0 0]
 [1 0 1]
 [0 0 2]]
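
The second pitfall, a class that is missing from the data, can be sketched in the same way. The batch below is hypothetical: class 1 is assumed to exist in the problem but does not appear in either list.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical batch in which class 1 never appears in y_true or y_pred
y_true = [0, 2, 2, 0]
y_pred = [0, 2, 0, 0]

# Without labels, only the classes present in the data (0 and 2) are used
cm_default = confusion_matrix(y_true, y_pred)
print(cm_default.shape)   # (2, 2)

# With labels, the matrix keeps a row and column for every listed class
cm_full = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])
print(cm_full.shape)      # (3, 3)
print(cm_full)
# [[2 0 0]
#  [0 0 0]
#  [1 0 1]]
```

Passing labels explicitly keeps the shape stable across batches, so matrices from different evaluation runs can be compared or summed element by element.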

Quick Reference

Parameter | Description
y_true    | Array of true class labels
y_pred    | Array of predicted class labels
labels    | List of labels to index the matrix (optional)
normalize | 'true', 'pred', 'all', or None for normalization

Key Takeaways

  • Always pass the true labels first, then the predicted labels, to confusion_matrix.
  • Use the labels parameter to control class order and to include every class.
  • Normalize the confusion matrix to compare models fairly across class imbalances.
  • Read the rows as true classes and the columns as predicted classes.
  • The off-diagonal entries show which classes are being confused with which.