MLOps · How-To · Beginner · 4 min read

How to Plot ROC Curve in Python with sklearn

To plot a ROC curve in Python, use roc_curve from sklearn.metrics to compute the false positive rate (FPR) and true positive rate (TPR), then plot one against the other with matplotlib. From scikit-learn 1.0 onward, RocCurveDisplay offers a one-line alternative that plots directly from a fitted model.

Syntax

The main function to compute ROC curve points is roc_curve(y_true, y_score), where y_true are true binary labels and y_score are predicted scores or probabilities. It returns false positive rate (FPR), true positive rate (TPR), and thresholds.

To plot, use matplotlib.pyplot.plot(FPR, TPR). Alternatively, RocCurveDisplay.from_estimator(estimator, X, y) plots ROC directly from a fitted model.

python
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

fpr, tpr, thresholds = roc_curve(y_true, y_score)
plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()

Example

This example shows how to train a logistic regression model, predict probabilities, compute ROC curve points, and plot the ROC curve with matplotlib.

python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Create a binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Train logistic regression
model = LogisticRegression(solver='liblinear')
model.fit(X, y)

# Predict probabilities for positive class
y_score = model.predict_proba(X)[:, 1]

# Compute ROC curve
fpr, tpr, thresholds = roc_curve(y, y_score)

# Compute AUC score
auc_score = roc_auc_score(y, y_score)

# Plot ROC curve
plt.plot(fpr, tpr, label=f'ROC curve (AUC = {auc_score:.2f})')
plt.plot([0, 1], [0, 1], 'k--')  # Diagonal line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve Example')
plt.legend(loc='lower right')
plt.show()
Output
A plot window showing the ROC curve rising well above the diagonal baseline, with a legend reading approximately 'ROC curve (AUC = 0.95)'.
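Note that this example scores the model on the same data it was trained on, which tends to inflate AUC. A minimal variant that evaluates on held-out data with sklearn's train_test_split might look like this (the split size and random seed are arbitrary choices):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = LogisticRegression(solver='liblinear').fit(X_train, y_train)
y_score = model.predict_proba(X_test)[:, 1]  # held-out probabilities

fpr, tpr, _ = roc_curve(y_test, y_score)
auc_score = roc_auc_score(y_test, y_score)

plt.plot(fpr, tpr, label=f'Test ROC (AUC = {auc_score:.2f})')
plt.plot([0, 1], [0, 1], 'k--')  # Diagonal baseline
plt.legend(loc='lower right')
plt.show()
```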

Common Pitfalls

  • Passing predicted class labels instead of predicted probabilities to roc_curve produces a degenerate curve with only a few points.
  • Forgetting to select the positive-class column (usually index 1) from the predict_proba output.
  • Confusing the ROC curve with the precision-recall curve, which plots precision against recall instead of TPR against FPR.
  • Plotting without the diagonal baseline, which makes it harder to judge the curve against random guessing.
python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
model = LogisticRegression(solver='liblinear').fit(X, y)

# Wrong: predicted class labels yield a degenerate curve with only a few points
# y_pred = model.predict(X)
# fpr, tpr, _ = roc_curve(y, y_pred)  # Incorrect

# Right: use predicted probabilities for the positive class
y_score = model.predict_proba(X)[:, 1]
fpr, tpr, _ = roc_curve(y, y_score)  # Correct

plt.plot(fpr, tpr)
plt.plot([0, 1], [0, 1], 'k--')  # Diagonal baseline
plt.show()
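On the third pitfall: ROC and precision-recall curves plot different quantities on their axes, so it helps to see them side by side. A rough sketch using sklearn's precision_recall_curve (the dataset and model are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, precision_recall_curve
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
model = LogisticRegression(solver='liblinear').fit(X, y)
y_score = model.predict_proba(X)[:, 1]

# ROC plots FPR vs TPR; precision-recall plots recall vs precision
fpr, tpr, _ = roc_curve(y, y_score)
precision, recall, _ = precision_recall_curve(y, y_score)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(fpr, tpr)
ax1.set(xlabel='False Positive Rate', ylabel='True Positive Rate', title='ROC')
ax2.plot(recall, precision)
ax2.set(xlabel='Recall', ylabel='Precision', title='Precision-Recall')
plt.show()
```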

Quick Reference

ROC Curve Quick Tips:

  • Use roc_curve(y_true, y_score) to get FPR and TPR.
  • Plot FPR on the X-axis and TPR on the Y-axis.
  • Use predicted probabilities, not class labels.
  • Use roc_auc_score(y_true, y_score) to get AUC metric.
  • For quick plotting, use RocCurveDisplay.from_estimator(model, X, y) (sklearn 1.0+).

Key Takeaways

Always use predicted probabilities, not class labels, to compute the ROC curve.
Plot the false positive rate on the X-axis and the true positive rate on the Y-axis.
Use roc_auc_score to summarize overall model performance as a single AUC value.
RocCurveDisplay.from_estimator offers a quick way to plot a ROC curve directly from a fitted model.
Include the diagonal baseline so the curve can be compared against random guessing.