Precision vs Recall in Python: Key Differences and Usage
Precision measures how many predicted positives are actually correct, while recall measures how many actual positives the model found. Both are important for classification tasks and can be calculated with `precision_score` and `recall_score` from scikit-learn's `sklearn.metrics` module.
Quick Comparison
Here is a quick side-by-side comparison of precision and recall to understand their key differences.
| Factor | Precision | Recall |
|---|---|---|
| Definition | Ratio of true positives to all predicted positives | Ratio of true positives to all actual positives |
| Focus | Correctness of positive predictions | Coverage of actual positives |
| Formula | TP / (TP + FP) | TP / (TP + FN) |
| Use case | When false positives are costly | When missing positives is costly |
| Example | Spam filter avoiding false spam labels | Disease test catching all sick patients |
| Range | 0 to 1 (higher is better) | 0 to 1 (higher is better) |
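The two formulas in the table can be sketched directly in plain Python. The helper names `precision` and `recall` below are illustrative, not part of any library:

```python
def precision(tp: int, fp: int) -> float:
    # Of everything predicted positive, how much was actually positive?
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    # Of everything actually positive, how much did the model find?
    return tp / (tp + fn)

# Example counts: 3 true positives, 1 false positive, 2 false negatives
print(precision(tp=3, fp=1))  # 0.75
print(recall(tp=3, fn=2))     # 0.6
```

Note that false positives only ever hurt precision, and false negatives only ever hurt recall, which is exactly the asymmetry the table describes.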
Key Differences
Precision tells you how many of the items labeled as positive by your model are actually positive. It focuses on the quality of positive predictions, so if your model says "yes," precision checks if it is usually right.
Recall, on the other hand, tells you how many of the actual positive items your model managed to find. It focuses on coverage, so if there are 100 positive cases, recall checks how many your model caught.
In practical terms, precision is important when false alarms (false positives) are bad, like wrongly flagging emails as spam. Recall is important when missing a positive case (false negative) is bad, like failing to detect a disease. Both metrics are calculated using true positives (TP), false positives (FP), and false negatives (FN) but emphasize different errors.
Code Comparison
Here is how to calculate precision in Python using sklearn on the same example data.
```python
from sklearn.metrics import precision_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]  # actual labels
y_pred = [0, 0, 1, 0, 1, 1, 1, 0]  # model predictions

# 3 true positives, 1 false positive: precision = 3 / (3 + 1)
precision = precision_score(y_true, y_pred)
print(f"Precision: {precision:.2f}")  # Precision: 0.75
```
Recall Equivalent
Here is how to calculate recall in Python using sklearn on the same example data.
```python
from sklearn.metrics import recall_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]  # actual labels
y_pred = [0, 0, 1, 0, 1, 1, 1, 0]  # model predictions

# 3 true positives, 2 false negatives: recall = 3 / (3 + 2)
recall = recall_score(y_true, y_pred)
print(f"Recall: {recall:.2f}")  # Recall: 0.60
```
When to Use Which
Choose precision when you want to be very sure about positive predictions and avoid false alarms, such as in email spam detection or fraud detection.
Choose recall when it is more important to catch all positive cases, even if some false alarms happen, such as in medical diagnosis or safety monitoring.
Often, you balance both using a combined metric like the F1 score, but understanding the difference helps you pick the right focus for your problem.
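As a quick sketch of that balance, scikit-learn's `f1_score` computes the harmonic mean of precision and recall; here it is run on the same example data used above:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]  # actual labels
y_pred = [0, 0, 1, 0, 1, 1, 1, 0]  # model predictions

p = precision_score(y_true, y_pred)  # 0.75
r = recall_score(y_true, y_pred)     # 0.60

# F1 is the harmonic mean: 2 * p * r / (p + r)
f1 = f1_score(y_true, y_pred)
print(f"F1: {f1:.2f}")  # F1: 0.67
```

Because the harmonic mean is dragged toward the smaller of the two values, a model cannot score a high F1 by excelling at only one of precision or recall.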