
Semi-supervised learning basics in ML Python - Model Metrics & Evaluation

Which metric matters for Semi-supervised learning and WHY

Semi-supervised learning uses both labeled and unlabeled data. The right metric depends on the task: often classification accuracy, precision, recall, or F1 score. We focus on metrics that show how well the model learns from limited labels and generalizes to new data. For example, if the goal is to find rare cases, recall matters most. If avoiding false alarms matters, precision is key. Accuracy alone can be misleading when classes are imbalanced.

Confusion matrix example
                  Predicted
                  Pos    Neg
Actual   Pos       40     10
         Neg       15     35

Total samples = 40 + 10 + 15 + 35 = 100

Precision = TP / (TP + FP) = 40 / (40 + 15) = 0.727
Recall = TP / (TP + FN) = 40 / (40 + 10) = 0.8
F1 = 2 * (0.727 * 0.8) / (0.727 + 0.8) ≈ 0.762
Accuracy = (TP + TN) / Total = (40 + 35) / 100 = 0.75
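The same calculations can be checked in plain Python, using the four counts from the confusion matrix above:

```python
# Counts taken from the worked confusion matrix: TP=40, FN=10, FP=15, TN=35.
TP, FN, FP, TN = 40, 10, 15, 35

total = TP + FN + FP + TN            # 100 samples
precision = TP / (TP + FP)           # 40 / 55 ≈ 0.727
recall = TP / (TP + FN)              # 40 / 50 = 0.8
f1 = 2 * precision * recall / (precision + recall)
accuracy = (TP + TN) / total         # 75 / 100 = 0.75

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.2f}")
```

Note that F1 computed from the exact (unrounded) precision comes out to about 0.762.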

Precision vs Recall tradeoff with examples

In semi-supervised learning, the model may guess labels for unlabeled data. If it guesses too many positives, precision drops (more false alarms). If it guesses too few, recall drops (misses real positives).

Example 1: Detecting spam emails. High precision means few good emails marked as spam. Better to avoid false alarms, so precision matters more.

Example 2: Detecting diseases. High recall means catching most sick patients. Missing a sick patient is worse, so recall matters more.
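The tradeoff can be made concrete with a confidence threshold: a model only accepts a pseudo-label (or positive prediction) when its score clears the threshold. The scores and labels below are made-up illustration data, not from any real model:

```python
# Made-up scores and ground-truth labels for ten samples.
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]
labels = [1,    1,   1,   0,   1,   0,    1,   0,   0,   0]

def precision_recall(threshold):
    """Predict positive when score >= threshold; return (precision, recall)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for t in (0.9, 0.5, 0.2):
    p, r = precision_recall(t)
    print(f"threshold={t}: precision={p:.2f} recall={r:.2f}")
```

A strict threshold (0.9) gives perfect precision but misses most positives; a loose threshold (0.2) catches every positive but lets in false alarms. This is exactly the spam-vs-disease tension described above.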

Good vs Bad metric values for Semi-supervised learning

Good: Balanced precision and recall above 0.7, F1 score above 0.7, accuracy reflecting true performance on labeled and unlabeled data.

Bad: High accuracy but very low recall or precision, indicating the model ignores minority classes or guesses poorly on unlabeled data.

Common pitfalls in metrics for Semi-supervised learning
  • Accuracy paradox: High accuracy can hide poor performance on rare classes.
  • Data leakage: Using unlabeled data incorrectly can leak test info, inflating metrics.
  • Overfitting: Model fits labeled data too closely but fails on unlabeled data, causing misleading metrics.
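The accuracy paradox from the first bullet is easy to demonstrate. In this sketch the data is synthetic (98 negatives, 2 positives) and the "model" is degenerate, always predicting negative:

```python
# Synthetic imbalanced dataset: 2 positives (rare cases), 98 negatives.
labels = [1] * 2 + [0] * 98
# Degenerate model: predicts "negative" for every sample.
preds = [0] * 100

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
recall = tp / (tp + fn) if tp + fn else 0.0

print(f"accuracy={accuracy:.2f} recall={recall:.2f}")  # 98% accuracy, 0% recall
```

Despite 98% accuracy, the model finds none of the rare cases, which is the situation in the self-check question below.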
Self-check question

Your semi-supervised model has 98% accuracy but only 12% recall on the positive class (rare cases). Is it good for production? Why or why not?

Answer: No, it is not good. The model misses most positive cases (low recall), which is critical if those cases matter. High accuracy is misleading because negatives dominate the data.

Key Result
In semi-supervised learning, balanced precision and recall are key to ensure the model learns well from limited labels and generalizes properly.