
Bagging concept in ML Python - Model Metrics & Evaluation

Which metric matters for Bagging and WHY

Bagging helps reduce errors by combining many models. The main goal is to lower variance and improve accuracy. So, accuracy and error rate are key metrics to check if bagging works well. For classification, accuracy, precision, and recall show how well the combined model predicts. For regression, mean squared error (MSE) or mean absolute error (MAE) tell us how close predictions are to true values.
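To see the variance-reduction idea concretely, here is a toy sketch in plain Python. Each "model" is just the mean of a bootstrap sample of noisy labels, a stand-in for a real regressor, so we can compare how much a single model's estimate varies versus the bagged (averaged) estimate. The data and model are invented for illustration only.

```python
import random

# Toy illustration: bagging reduces variance by averaging bootstrap models.
# Each "model" here is the mean of one bootstrap sample (a stand-in for a
# trained regressor's prediction).
random.seed(0)
data = [1.0 if random.random() < 0.7 else 0.0 for _ in range(200)]

def bootstrap_mean(sample):
    # One "model": resample with replacement, return the sample mean.
    boot = [random.choice(sample) for _ in sample]
    return sum(boot) / len(boot)

# 500 repeated runs of a single model vs. a bag of 25 averaged models.
single = [bootstrap_mean(data) for _ in range(500)]
bagged = [sum(bootstrap_mean(data) for _ in range(25)) / 25
          for _ in range(500)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

print(variance(single) > variance(bagged))  # bagged estimates vary less
```

Averaging 25 independent estimates shrinks the variance by roughly a factor of 25, which is exactly why bagged ensembles give more stable predictions than any single base model.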

Confusion matrix example for Bagging

Imagine a bagging model classifying emails as spam or not spam. Here is a confusion matrix from 100 emails:

      |                 | Predicted Spam | Predicted Not Spam |
      |-----------------|----------------|--------------------|
      | Actual Spam     | TP = 40        | FN = 10            |
      | Actual Not Spam | FP = 5         | TN = 45            |

Totals: TP + FP + FN + TN = 40 + 5 + 10 + 45 = 100 emails.

From this, we calculate:

  • Precision = TP / (TP + FP) = 40 / (40 + 5) = 0.89
  • Recall = TP / (TP + FN) = 40 / (40 + 10) = 0.80
  • Accuracy = (TP + TN) / Total = (40 + 45) / 100 = 0.85
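The three calculations above can be checked with a few lines of Python, using the counts from the confusion matrix:

```python
# Metrics from the confusion matrix above.
tp, fp, fn, tn = 40, 5, 10, 45

precision = tp / (tp + fp)                  # 40 / 45
recall    = tp / (tp + fn)                  # 40 / 50
accuracy  = (tp + tn) / (tp + fp + fn + tn)

print(round(precision, 2), round(recall, 2), accuracy)  # 0.89 0.8 0.85
```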

Precision vs Recall tradeoff in Bagging

Bagging can improve recall because averaging the votes of many models reduces missed positive cases. However, it may lower precision if the ensemble predicts positives too liberally.

For example, in medical tests, missing a sick patient (low recall) is worse than a false alarm (low precision). Bagging helps catch more sick patients by increasing recall.

In spam detection, high precision is important to avoid marking good emails as spam. Bagging can be tuned to balance this tradeoff by adjusting thresholds.
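Threshold adjustment can be sketched like this. The averaged probabilities and labels below are hypothetical stand-ins for the output of a bagged ensemble; the point is only to show that lowering the decision threshold raises recall at the cost of precision.

```python
# Hypothetical averaged probabilities from a bagged ensemble, with true labels.
probs  = [0.95, 0.80, 0.70, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    1,    0,    1,    0,    0,    0]

def precision_recall(probs, labels, threshold):
    # Classify as positive when the averaged probability clears the threshold.
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    return tp / (tp + fp), tp / (tp + fn)

print(precision_recall(probs, labels, 0.5))   # (0.75, 0.75)
print(precision_recall(probs, labels, 0.25))  # precision drops, recall hits 1.0
```

The same threshold knob works on any model that outputs probabilities, so the balance point can be chosen per application: high recall for medical screening, high precision for spam filtering.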

Good vs Bad metric values for Bagging

Good values:

  • Accuracy above 85% on test data shows bagging improved predictions.
  • Precision and recall both above 80% means balanced and reliable predictions.
  • Lower error rates compared to a single model show bagging reduced variance.

Bad values:

  • Accuracy close to random guessing (e.g., 50% for two classes) means bagging did not help.
  • Very high precision but very low recall means many true cases are missed.
  • High error rates or unstable results on new data suggest overfitting or poor bagging setup.

Common pitfalls in Bagging metrics

  • Accuracy paradox: High accuracy can be misleading if data is imbalanced. For example, if 95% of emails are not spam, a model always predicting not spam gets 95% accuracy but is useless.
  • Data leakage: If test data leaks into training, bagging looks better than it really is.
  • Overfitting: Bagging reduces overfitting, but if the base models are too complex, the combined model may still overfit.
  • Ignoring variance: Bagging mainly reduces variance, so metrics should be checked on new unseen data, not just training data.
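The accuracy paradox from the list above is easy to reproduce with made-up data: a model that always predicts "not spam" on a 95%-negative dataset scores 95% accuracy with zero recall.

```python
# Accuracy paradox on imbalanced data: always predicting the majority
# class looks accurate but catches no positives at all.
labels = [1] * 5 + [0] * 95          # 5% spam, 95% not spam
preds  = [0] * 100                   # always predict "not spam"

accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
recall = tp / (tp + fn)

print(accuracy, recall)  # 0.95 0.0
```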

Self-check question

Your bagging model has 98% accuracy but only 12% recall on fraud cases. Is it good for production?

Answer: No, it is not good. Even though accuracy is high, the model misses 88% of fraud cases (low recall). For fraud detection, catching fraud (high recall) is critical. This model would let most fraud slip through.

Key Result
Bagging improves accuracy by reducing variance; key metrics are accuracy, precision, and recall to ensure balanced, reliable predictions.