Classification evaluation measures how well a model assigns inputs to the correct categories. It tells us whether the model is good enough or needs improvement.
Classification evaluation (accuracy, precision, recall, F1) in ML Python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

accuracy = accuracy_score(true_labels, predicted_labels)
precision = precision_score(true_labels, predicted_labels)
recall = recall_score(true_labels, predicted_labels)
f1 = f1_score(true_labels, predicted_labels)
accuracy_score returns the fraction of all predictions that are correct.
precision_score returns the fraction of predicted positives that are actually positive.
recall_score returns the fraction of actual positives that the model found.
f1_score combines precision and recall into one number, their harmonic mean.
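For the binary case, the four descriptions above can be written as formulas. The symbols TP, TN, FP and FN (true positives, true negatives, false positives, false negatives) are notation introduced here for illustration and do not appear elsewhere in this tutorial.

```latex
\begin{align*}
\text{accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{precision} &= \frac{TP}{TP + FP} \\
\text{recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\end{align*}
```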
accuracy = accuracy_score([1,0,1,1], [1,0,0,1])
precision = precision_score([1,0,1,1], [1,0,0,1])
recall = recall_score([1,0,1,1], [1,0,0,1])
f1 = f1_score([1,0,1,1], [1,0,0,1])
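To see where these numbers come from, the short example above can be reproduced by hand without scikit-learn. The sketch below assumes binary labels where 1 is the positive class. The helper names confusion_counts and binary_metrics are hypothetical and not part of any library, and returning 0.0 when a denominator is zero is a choice made for this sketch.

```python
# Hand computation of the four metrics for binary labels (1 = positive).
# confusion_counts and binary_metrics are hypothetical helper names,
# not part of scikit-learn.

def confusion_counts(true_labels, predicted_labels):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(true_labels, predicted_labels):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(true_labels, predicted_labels):
    """Return (accuracy, precision, recall, f1) computed from the counts."""
    tp, fp, fn, tn = confusion_counts(true_labels, predicted_labels)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Fall back to 0.0 when a denominator is zero (an assumption of this sketch).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1


accuracy, precision, recall, f1 = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
```

For these four labels there are 2 true positives, 0 false positives, 1 false negative and 1 true negative, so accuracy is 0.75, precision is 1.0, recall is about 0.67 and F1 is about 0.8.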
This program compares true labels and predicted labels, then prints four common evaluation scores to see how well the model did.
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# True labels (actual categories)
true_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

# Predicted labels by the model
predicted_labels = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Calculate evaluation metrics
accuracy = accuracy_score(true_labels, predicted_labels)
precision = precision_score(true_labels, predicted_labels)
recall = recall_score(true_labels, predicted_labels)
f1 = f1_score(true_labels, predicted_labels)

print(f"Accuracy: {accuracy:.2f}")
print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
print(f"F1 Score: {f1:.2f}")
Accuracy can be misleading if classes are imbalanced (one class is much more common than the other).
Precision is important when false positives are costly (e.g., wrongly flagging emails as spam).
Recall is important when missing positives is costly (e.g., missing sick patients).
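The point about imbalanced classes can be illustrated with a small sketch. The data below is made up for illustration: 95 negatives, 5 positives, and a hypothetical "model" that always predicts the majority class. The metrics are computed by hand in plain Python so the example runs without any library.

```python
# A made-up, heavily imbalanced dataset: 95 negatives (0) and 5 positives (1).
true_labels = [0] * 95 + [1] * 5

# A hypothetical "model" that always predicts the majority class.
predicted_labels = [0] * 100

# Accuracy: fraction of predictions that match the true label.
correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
accuracy = correct / len(true_labels)

# Recall: fraction of actual positives that were predicted as positive.
true_positives = sum(t == 1 and p == 1 for t, p in zip(true_labels, predicted_labels))
actual_positives = sum(t == 1 for t in true_labels)
recall = true_positives / actual_positives

print(f"Accuracy: {accuracy:.2f}")  # high, even though the model learned nothing
print(f"Recall: {recall:.2f}")      # no positive case was found
```

Accuracy comes out at 0.95 while recall is 0.00, which is why accuracy alone should not be trusted when one class dominates.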
Accuracy shows overall correct predictions.
Precision and recall measure performance on the positive class: precision checks the predicted positives, recall checks the actual positives.
F1 score balances precision and recall into one number.