TensorFlowml~20 mins

Precision-recall curves in TensorFlow - ML Experiment: Train & Evaluate

Choose your learning style9 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Experiment - Precision-recall curves

Problem:You have a binary classification model trained with TensorFlow. The model predicts whether emails are spam or not. Currently, the model's precision and recall trade-off is unclear, making it hard to choose the best threshold for predictions.

Current Metrics:At default threshold 0.5, precision: 0.75, recall: 0.60

Issue:The model's precision and recall at threshold 0.5 do not balance well. You want to understand how precision and recall change with different thresholds to pick the best one.

Your Task

Plot the precision-recall curve for the model's predictions on the test set and find the threshold that gives the best balance between precision and recall.

Use TensorFlow and sklearn libraries only.

Do not retrain the model; use the existing predictions.

Keep the code runnable and simple for beginners.

Hint 1

Hint 2

Hint 3

Solution

TensorFlow

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import precision_recall_curve
import tensorflow as tf

# Simulate test labels and model predicted probabilities
np.random.seed(42)
y_true = np.random.randint(0, 2, size=1000)  # 0 or 1 labels
# Simulate predicted probabilities with some noise
y_scores = y_true * 0.7 + (1 - y_true) * 0.3 + np.random.normal(0, 0.1, size=1000)
y_scores = np.clip(y_scores, 0, 1)

# Calculate precision, recall, thresholds
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# Plot precision and recall vs thresholds
plt.figure(figsize=(8, 6))
plt.plot(thresholds, precision[:-1], 'b--', label='Precision')
plt.plot(thresholds, recall[:-1], 'g-', label='Recall')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.title('Precision-Recall vs Threshold')
plt.legend()
plt.grid(True)
plt.show()

# Find threshold where precision and recall are closest
diff = np.abs(precision[:-1] - recall[:-1])
best_idx = np.argmin(diff)
best_threshold = thresholds[best_idx]
best_precision = precision[best_idx]
best_recall = recall[best_idx]

print(f'Best threshold for balanced precision and recall: {best_threshold:.2f}')
print(f'Precision at best threshold: {best_precision:.2f}')
print(f'Recall at best threshold: {best_recall:.2f}')

Added code to compute precision, recall, and thresholds using sklearn.

Plotted precision and recall scores against thresholds to visualize trade-off.

Calculated the threshold where precision and recall are closest to balance them.

Results Interpretation

Before: At threshold 0.5, precision was 0.75 and recall was 0.60, showing imbalance.

After: By analyzing the precision-recall curve, we found a threshold (around 0.50) where precision and recall are closer (both about 0.74 and 0.72), improving balance.

Precision-recall curves help us understand how changing the decision threshold affects precision and recall. This lets us pick a threshold that balances false positives and false negatives better for our problem.

Bonus Experiment

Try plotting the full precision-recall curve (precision vs recall) and calculate the average precision score.

💡 Hint

Use sklearn.metrics.average_precision_score and plot recall on x-axis and precision on y-axis.