How to Plot ROC Curve in Python with sklearn
To plot a ROC curve in Python, use roc_curve from sklearn.metrics to get the false positive rate and true positive rate, then plot them with matplotlib. For a quick plot, you can also use RocCurveDisplay (available since sklearn 1.0).

Syntax
The main function for computing ROC curve points is roc_curve(y_true, y_score), where y_true holds the true binary labels and y_score holds the predicted scores or probabilities. It returns three arrays: the false positive rates (FPR), the true positive rates (TPR), and the decision thresholds at which they were computed.
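As a quick sanity check of what roc_curve returns, here is a minimal sketch on toy data (the labels and scores are illustrative values, not from this article's example):

```python
from sklearn.metrics import roc_curve

# Toy binary labels and predicted scores (illustrative only)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

fpr, tpr, thresholds = roc_curve(y_true, y_score)
print(fpr)  # [0.  0.  0.5 0.5 1. ]
print(tpr)  # [0.  0.5 0.5 1.  1. ]
# thresholds is in decreasing order; its first entry is a sentinel
# above the maximum score, so no instance is predicted positive there
print(thresholds)
```

Each (fpr[i], tpr[i]) pair is one point on the ROC curve, obtained by thresholding y_score at thresholds[i].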
To plot, pass the results to matplotlib.pyplot.plot(fpr, tpr). Alternatively, RocCurveDisplay.from_estimator(estimator, X, y) plots the ROC curve directly from a fitted model.
```python
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

fpr, tpr, thresholds = roc_curve(y_true, y_score)

plt.plot(fpr, tpr)
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.show()
```
Example
This example shows how to train a logistic regression model, predict probabilities, compute ROC curve points, and plot the ROC curve with matplotlib.
```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Create a binary classification dataset
X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)

# Train logistic regression
model = LogisticRegression(solver='liblinear')
model.fit(X, y)

# Predict probabilities for positive class
y_score = model.predict_proba(X)[:, 1]

# Compute ROC curve
fpr, tpr, thresholds = roc_curve(y, y_score)

# Compute AUC score
auc_score = roc_auc_score(y, y_score)

# Plot ROC curve
plt.plot(fpr, tpr, label=f'ROC curve (AUC = {auc_score:.2f})')
plt.plot([0, 1], [0, 1], 'k--')  # Diagonal line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve Example')
plt.legend(loc='lower right')
plt.show()
```
Output
A plot window showing the ROC curve bowing above the diagonal baseline, with a legend reading approximately 'ROC curve (AUC = 0.95)'.
Common Pitfalls
- Passing predicted class labels instead of predicted probabilities to roc_curve produces an incorrect ROC curve.
- Not selecting the positive-class column (usually index 1) from predict_proba.
- Confusing the ROC curve with the precision-recall curve.
- Plotting without the diagonal baseline for reference.
```python
from sklearn.metrics import roc_curve
import matplotlib.pyplot as plt

# Wrong: using predicted classes
# y_pred = model.predict(X)
# fpr, tpr, _ = roc_curve(y, y_pred)  # Incorrect

# Right: use predicted probabilities for the positive class
# y_score = model.predict_proba(X)[:, 1]
# fpr, tpr, _ = roc_curve(y, y_score)  # Correct

# Plotting
# plt.plot(fpr, tpr)
# plt.plot([0, 1], [0, 1], 'k--')
# plt.show()
```
Quick Reference
ROC Curve Quick Tips:
- Use roc_curve(y_true, y_score) to get FPR and TPR.
- Plot FPR on the X-axis and TPR on the Y-axis.
- Use predicted probabilities, not class labels.
- Use roc_auc_score(y_true, y_score) to get the AUC metric.
- For quick plotting, use RocCurveDisplay.from_estimator(model, X, y) (sklearn 1.0+).
Key Takeaways
- Always use predicted probabilities, not class labels, to compute the ROC curve.
- Plot the false positive rate on the X-axis and the true positive rate on the Y-axis.
- Use roc_auc_score to measure overall model performance with AUC.
- RocCurveDisplay.from_estimator offers a quick way to plot a ROC curve from a fitted model.
- Include the diagonal baseline line to compare model performance visually.
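One caveat worth noting: the worked example above computes the ROC curve on the same data the model was trained on, which tends to inflate AUC. A minimal sketch of the same workflow evaluated on a held-out test split (the split fraction and variable names are illustrative choices, not from the original example):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

X, y = make_classification(n_samples=1000, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

model = LogisticRegression(solver='liblinear').fit(X_train, y_train)

# Score only the held-out test set so the curve reflects generalization
y_score = model.predict_proba(X_test)[:, 1]
fpr, tpr, _ = roc_curve(y_test, y_score)
auc = roc_auc_score(y_test, y_score)

plt.plot(fpr, tpr, label=f'Test ROC (AUC = {auc:.2f})')
plt.plot([0, 1], [0, 1], 'k--')  # chance-level baseline
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend(loc='lower right')
plt.show()
```

The only change from the main example is fitting on X_train and scoring on X_test; everything else about computing and plotting the curve is identical.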