What Is Recall in Machine Learning in Python?


Recall in Machine Learning with Python: Definition and Example

In machine learning, recall measures how well a model finds the positive cases. It is the ratio of correctly predicted positives to all actual positives. In Python, you can calculate it with sklearn.metrics.recall_score from scikit-learn.

⚙️

How It Works

Recall tells us how many of the actual positive cases our model correctly identified. Imagine you are a doctor testing for a disease. Recall answers the question: out of all the sick patients, how many did the test catch?

It is calculated as the number of true positives divided by the sum of true positives and false negatives. True positives are positive cases the model correctly labeled as positive, and false negatives are positive cases the model missed by labeling them negative.
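The formula above can be sketched in plain Python. This is a minimal illustration, not how sklearn implements it; the helper name recall_from_counts and the patient counts are made up for the example.

```python
def recall_from_counts(true_positives: int, false_negatives: int) -> float:
    """Recall = TP / (TP + FN). Hypothetical helper for illustration only."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # With no actual positives, recall is undefined; this sketch returns 0.0
        return 0.0
    return true_positives / actual_positives


# Illustrative numbers: 20 sick patients, the test catches 18 and misses 2
recall = recall_from_counts(true_positives=18, false_negatives=2)
print(f"Recall: {recall:.2f}")  # Recall: 0.90
```

The zero-positives branch is a choice made for this sketch; other conventions for that edge case are possible.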

High recall means the model misses very few positive cases, which is important when missing a positive is costly, like in medical diagnosis or fraud detection.

💻

Example

This example shows how to calculate recall in Python using sklearn. We create true labels and predicted labels, then compute the recall score.

python

from sklearn.metrics import recall_score

# True labels (actual values)
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
# Predicted labels from model
y_pred = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]

# Calculate recall
recall = recall_score(y_true, y_pred)
print(f"Recall: {recall:.2f}")

Output

Recall: 0.80
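To see where 0.80 comes from, the outcomes can be counted by hand. The following is a small sketch in plain Python that reuses the labels from the example above; the variable names tp, fn, and fp are introduced here for illustration.

```python
# Same labels as in the example above
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 1, 1, 0]

# Pair each actual label with its prediction and count the outcomes
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # caught positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # missed positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false alarms

print(f"TP={tp}, FN={fn}, FP={fp}")  # TP=4, FN=1, FP=1
print(f"Recall: {tp / (tp + fn):.2f}")  # Recall: 0.80
```

The model found 4 of the 5 actual positives. The single false positive does not enter the recall calculation at all.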

🎯

When to Use

Use recall when it is very important to catch all positive cases, even if it means some false alarms. For example:

Medical tests where missing a disease can be dangerous
Fraud detection to catch as many fraud cases as possible
Spam filters, when letting spam through is worse than occasionally flagging a legitimate email

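Recall itself ignores false positives, so it is usually read alongside a measure of false alarms, such as precision, to weigh the cost of missed positives against the cost of false positives for the problem at hand.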
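To make the trade-off with false alarms concrete, here is a rough sketch using made-up fraud data and a hypothetical helper, precision_and_recall, written in plain Python. It assumes binary 0/1 labels and shows that a "model" that flags everything reaches perfect recall while most of its alerts are wrong.

```python
def precision_and_recall(y_true, y_pred):
    """Hypothetical helper: returns (precision, recall) for binary 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero; returning 0.0 is a choice made for this sketch
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up data: 2 fraud cases among 10 transactions
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
# A "model" that flags every transaction as fraud
y_flag_all = [1] * len(y_true)

precision, recall = precision_and_recall(y_true, y_flag_all)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
# Precision: 0.20, Recall: 1.00
```

In practice, sklearn.metrics.precision_score can be used next to recall_score instead of a hand-written helper.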

✅

Key Points

Recall measures how many actual positives the model correctly finds.
It is important when missing positives is costly.
Recall = True Positives / (True Positives + False Negatives).
Use sklearn.metrics.recall_score in Python to calculate it.

✅

Key Takeaways

Recall shows the model's ability to find all positive cases.

High recall means fewer missed positive cases.

Calculate recall in Python with sklearn's recall_score function.

Recall is crucial when missing positives has serious consequences.