
F1 Score in Machine Learning with Python: Definition and Example

The F1 score in machine learning is a metric that balances precision and recall into a single number, showing how well a model performs on classification tasks. In Python, you can calculate it easily using sklearn.metrics.f1_score.
⚙️ How It Works

The F1 score combines two important ideas: precision and recall. Imagine you are sorting apples from oranges. Precision tells you how many of the fruits you picked as apples are actually apples, while recall tells you how many of all the actual apples you managed to find. The F1 score balances these two, giving a single number that shows how good your sorting is overall.

This is useful because sometimes a model might be very precise but miss many true cases (low recall), or find many true cases but also many wrong ones (low precision). The F1 score helps you see the trade-off clearly.

💻 Example

This example shows how to calculate the F1 score in Python using sklearn. We create some true labels and predicted labels, then compute the F1 score.

```python
from sklearn.metrics import f1_score

# True labels
true_labels = [0, 1, 1, 0, 1, 0, 1, 1]
# Predicted labels by the model
predicted_labels = [0, 0, 1, 0, 1, 1, 1, 0]

# Calculate F1 score
score = f1_score(true_labels, predicted_labels)
print(f"F1 Score: {score:.2f}")
```

Output:

F1 Score: 0.67
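The example above is binary classification. When your labels have more than two classes, sklearn's f1_score needs an average parameter to say how the per-class scores should be combined. A small sketch with hypothetical three-class labels:

```python
from sklearn.metrics import f1_score

# Hypothetical three-class labels, for illustration only
true_labels = [0, 1, 2, 2, 1, 0, 2, 1]
predicted_labels = [0, 2, 2, 2, 1, 0, 1, 1]

# "macro" averages the per-class F1 scores equally;
# "weighted" weights each class's F1 by its frequency in true_labels
macro = f1_score(true_labels, predicted_labels, average="macro")
weighted = f1_score(true_labels, predicted_labels, average="weighted")

print(f"Macro F1:    {macro:.2f}")
print(f"Weighted F1: {weighted:.2f}")
```

Macro averaging treats rare classes as equally important; weighted averaging reflects the class balance of your data.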
🎯 When to Use

Use the F1 score when you want a balance between precision and recall, especially if you care about both false positives and false negatives. It is very helpful in cases like spam detection, medical diagnosis, or fraud detection, where missing true cases (false negatives) and wrongly flagging cases (false positives) both carry real costs.

For example, in medical tests, you want to catch as many sick patients as possible (high recall) but also avoid wrongly diagnosing healthy people (high precision). The F1 score helps measure this balance.
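This balance is easy to see with two deliberately extreme (hypothetical) models: one that flags almost every patient as sick and one that flags almost no one. Each scores well on a single metric, but F1 exposes the weakness of both:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical test results: 1 = sick, 0 = healthy
true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Model A flags almost everyone: high recall, low precision
pred_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
# Model B flags almost no one: high precision, low recall
pred_b = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

for name, pred in [("A", pred_a), ("B", pred_b)]:
    p = precision_score(true, pred)
    r = recall_score(true, pred)
    f = f1_score(true, pred)
    print(f"Model {name}: precision={p:.2f} recall={r:.2f} F1={f:.2f}")
```

Model A catches every sick patient but wrongly flags many healthy ones; Model B never flags a healthy patient but misses most sick ones. The F1 score penalizes both failure modes.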

Key Points

  • The F1 score is the harmonic mean of precision and recall.
  • It ranges from 0 (worst) to 1 (best).
  • It is useful when you want to balance false positives and false negatives.
  • In sklearn, use f1_score from sklearn.metrics to calculate it easily.

Key Takeaways

  • The F1 score balances precision and recall into one metric for classification.
  • Use sklearn's f1_score function to calculate it easily in Python.
  • It is best for tasks where false positives and false negatives both matter.
  • The F1 score ranges from 0 to 1, with 1 being perfect performance.
  • It helps evaluate models when you want a balanced view of errors.