How to Use Confusion Matrix in sklearn with Python

Use confusion_matrix from sklearn.metrics by passing true labels and predicted labels as arguments. It returns a matrix showing counts of correct and incorrect predictions for each class.

📐

Syntax

The confusion_matrix function computes a confusion matrix: a table of counts in which entry (i, j) is the number of samples whose true class is i and whose predicted class is j. It accepts these arguments:

y_true: The true labels of your data.
y_pred: The predicted labels from your model.
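labels (optional): List of labels to index the matrix. Use it to control the row and column order or to select a subset of labels. If omitted, the labels that appear in y_true or y_pred are used in sorted order.
normalize (optional): None (raw counts, the default), 'true' (normalize over rows), 'pred' (normalize over columns), or 'all' (normalize over the whole matrix).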

python

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_true, y_pred, labels=None, normalize=None)
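As a rough illustration of the two optional parameters, the sketch below uses made-up string labels (the "cat", "dog", and "bird" data and the variable names are assumptions chosen for this example, not part of the original article). It compares the default label order with an explicit one and then normalizes over the true classes.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical string labels, chosen only for illustration
y_true = ["cat", "dog", "cat", "bird", "dog", "cat"]
y_pred = ["cat", "cat", "cat", "bird", "dog", "dog"]

# Without `labels`, classes follow sorted label order: bird, cat, dog
cm_default = confusion_matrix(y_true, y_pred)

# With `labels`, rows and columns follow the order given here
order = ["dog", "cat", "bird"]
cm_ordered = confusion_matrix(y_true, y_pred, labels=order)

# normalize="true" divides each row by its total, so each row sums to 1
cm_norm = confusion_matrix(y_true, y_pred, labels=order, normalize="true")

print(cm_default)
print(cm_ordered)
print(np.round(cm_norm, 2))
```

Passing labels explicitly is a reasonable habit whenever the matrix will be plotted or compared across runs, since the row and column order then no longer depends on which classes happen to appear in the data.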

💻

Example

This example shows how to create a confusion matrix for a simple classification problem with true and predicted labels.

python

from sklearn.metrics import confusion_matrix

# True labels
y_true = [0, 1, 2, 2, 0, 1]
# Predicted labels
y_pred = [0, 2, 2, 2, 0, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)

Output

[[2 0 0]
 [1 0 1]
 [0 0 2]]
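To read this output: rows are true classes and columns are predicted classes, so the diagonal holds the correct predictions. The following sketch, which reuses the same data, pulls a few summary numbers out of the matrix. The variable names (correct_per_class, samples_per_class, accuracy) are illustrative choices, not sklearn names.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 2, 2, 0, 1]
y_pred = [0, 2, 2, 2, 0, 0]
cm = confusion_matrix(y_true, y_pred)

# cm[i, j] counts samples whose true class is i and predicted class is j
correct_per_class = np.diag(cm)      # diagonal: correct predictions per class
samples_per_class = cm.sum(axis=1)   # row totals: true samples per class
accuracy = np.trace(cm) / cm.sum()   # correct predictions / all predictions

print(correct_per_class)   # [2 0 2]
print(samples_per_class)   # [2 2 2]
print(round(accuracy, 3))  # 0.667
```

Here class 1 is never predicted correctly: one of its samples is predicted as class 0 and the other as class 2, which is exactly what the middle row [1 0 1] says.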

⚠️

Common Pitfalls

Common mistakes when using confusion_matrix include:

Mixing up y_true and y_pred, which transposes the matrix and leads to incorrect interpretation.
Omitting labels when a class is absent from both y_true and y_pred. The labels are inferred from the data in sorted order, so the matrix comes out smaller than expected.
Ignoring normalization when comparing models evaluated on data with different class distributions.

python

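from sklearn.metrics import confusion_matrix

# Same data as in the example above
y_true = [0, 1, 2, 2, 0, 1]
y_pred = [0, 2, 2, 2, 0, 0]

# Wrong: swapped arguments (the result is the transpose of the correct matrix)
cm_wrong = confusion_matrix(y_pred, y_true)
print('Wrong matrix:\n', cm_wrong)

# Right: true labels first, then predicted labels
cm_right = confusion_matrix(y_true, y_pred)
print('Right matrix:\n', cm_right)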

Output

Wrong matrix:
 [[2 1 0]
 [0 0 0]
 [0 1 2]]
Right matrix:
 [[2 0 0]
 [1 0 1]
 [0 0 2]]
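The missing-label pitfall can be sketched in the same way. The example below assumes a hypothetical three-class problem (classes 0, 1, 2) and an evaluation batch in which class 1 happens not to occur in either list; the data is invented for illustration.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical batch in which class 1 never occurs in either list
y_true = [0, 2, 2, 0]
y_pred = [0, 2, 0, 0]

# Labels are inferred from the data, so the result is 2x2 (classes 0 and 2)
cm_inferred = confusion_matrix(y_true, y_pred)

# Listing all three classes keeps the 3x3 shape;
# class 1 simply gets an all-zero row and column
cm_full = confusion_matrix(y_true, y_pred, labels=[0, 1, 2])

print(cm_inferred.shape)  # (2, 2)
print(cm_full.shape)      # (3, 3)
print(cm_full)
```

Fixing the label list up front keeps matrices from different batches or models the same shape, so they can be added or compared cell by cell.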

📊

Quick Reference

Parameter	Description
y_true	Array of true class labels
y_pred	Array of predicted class labels
labels	List of labels to index matrix (optional)
normalize	'true', 'pred', 'all' or None for normalization

✅

Key Takeaways

Always pass true labels first, then predicted labels to confusion_matrix.

Use the labels parameter to control class order and include all classes.

Normalize the confusion matrix to compare models fairly across class imbalances.

Interpret the matrix rows as true classes and columns as predicted classes.

A confusion matrix shows which classes are mistaken for which, not just how many predictions are wrong.