F1 Score in Machine Learning with Python: Definition and Example
F1 score in machine learning is a metric that balances precision and recall into a single number, showing how well a model performs on classification tasks. In Python, you can calculate it easily using sklearn.metrics.f1_score.
How It Works
The F1 score combines two important ideas: precision and recall. Imagine you are picking the apples out of a basket of apples and oranges. Precision tells you how many of the fruits you picked are actually apples, while recall tells you how many of all the apples in the basket you managed to pick. The F1 score balances these two, giving one number that shows how good your sorting is overall.
This is useful because sometimes a model might be very precise but miss many true cases (low recall), or find many true cases but also many wrong ones (low precision). The F1 score helps you see the trade-off clearly.
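The trade-off described above can be sketched with a few lines of plain Python. The helper name precision_recall_f1 and the counts below are made up for illustration; they are not part of sklearn or of any real dataset.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1) from raw counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Hypothetical helper for illustration; not part of sklearn.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# A cautious model: almost everything it flags is right,
# but it misses most of the true cases (illustrative counts).
cautious = precision_recall_f1(tp=10, fp=1, fn=90)

# An eager model: it finds most true cases,
# but also flags many wrong ones (illustrative counts).
eager = precision_recall_f1(tp=90, fp=180, fn=10)

for name, (p, r, f1) in [("cautious", cautious), ("eager", eager)]:
    print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In both cases the F1 score lands well below the stronger of the two numbers, which is how it exposes a model that does well on only one side.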
Example
This example shows how to calculate the F1 score in Python using sklearn. We create some true labels and predicted labels, then compute the F1 score.
from sklearn.metrics import f1_score

# True labels
true_labels = [0, 1, 1, 0, 1, 0, 1, 1]

# Predicted labels by the model
predicted_labels = [0, 0, 1, 0, 1, 1, 1, 0]

# Calculate F1 score
score = f1_score(true_labels, predicted_labels)
print(f"F1 Score: {score:.2f}")
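To see where that number comes from, the same labels can be counted by hand. This is a rough sketch in plain Python, not how sklearn computes the score internally; it treats label 1 as the positive class, which matches the default behavior of f1_score for binary labels.

```python
true_labels = [0, 1, 1, 0, 1, 0, 1, 1]
predicted_labels = [0, 0, 1, 0, 1, 1, 1, 0]

pairs = list(zip(true_labels, predicted_labels))

# Count the three outcomes that matter for the F1 score.
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted 1
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted 1, actually 0
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted 0, actually 1

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"TP={tp} FP={fp} FN={fn}")       # TP=3 FP=1 FN=2
print(f"Precision: {precision:.2f}")    # Precision: 0.75
print(f"Recall: {recall:.2f}")          # Recall: 0.60
print(f"F1 Score: {f1:.2f}")            # F1 Score: 0.67
```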
When to Use
Use the F1 score when you want a balance between precision and recall, especially if you care equally about false positives and false negatives. It is helpful in tasks like spam detection, medical diagnosis, or fraud detection, where both missing true cases and wrongly flagging cases are costly.
For example, in medical tests, you want to catch as many sick patients as possible (high recall) but also avoid wrongly diagnosing healthy people (high precision). The F1 score helps measure this balance.
Key Points
- The F1 score is the harmonic mean of precision and recall.
- It ranges from 0 (worst) to 1 (best).
- It is useful when you want to balance false positives and false negatives.
- In sklearn, use f1_score from sklearn.metrics to calculate it easily.
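The first key point, that the F1 score is the harmonic mean of precision and recall, can be checked with Python's standard library. This is a small sketch; the precision and recall values are illustrative numbers chosen for the comparison, not results from a real model.

```python
from statistics import harmonic_mean, mean

# Illustrative values: fairly balanced precision and recall.
precision, recall = 0.75, 0.60
f1 = harmonic_mean([precision, recall])
print(f"F1 (harmonic mean): {f1:.2f}")

# Illustrative values: very high precision, very low recall.
lopsided_f1 = harmonic_mean([0.99, 0.10])
lopsided_avg = mean([0.99, 0.10])
print(f"Lopsided harmonic mean: {lopsided_f1:.2f}")
print(f"Lopsided simple average: {lopsided_avg:.2f}")
```

The harmonic mean stays close to the smaller value, so a model cannot reach a high F1 score by doing well on precision or recall alone. A simple average would hide that weakness.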