
Best model saving pattern in PyTorch - Model Metrics & Evaluation

Which metric matters for this concept and WHY

When saving the best model during training, the key metric to track is the validation metric that best reflects your goal. If overall correctness matters most, track validation accuracy; if catching rare events matters most, track validation recall. Saving the checkpoint with the best value of this metric ensures you keep the most useful version of the model.
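The pattern above can be sketched as a small PyTorch loop. This is a minimal sketch, not a full training script: the model is a stand-in `nn.Linear`, and the per-epoch validation accuracies are made-up values where a real loop would evaluate on a validation DataLoader.

```python
import torch
import torch.nn as nn

# Stand-in model; in practice this is your real network.
model = nn.Linear(4, 2)

# Hypothetical validation accuracy per epoch (normally computed
# by evaluating `model` on a validation set each epoch).
val_accs = [0.71, 0.78, 0.75, 0.81, 0.80]

best_acc = float("-inf")
for epoch, val_acc in enumerate(val_accs):
    # ... training step for this epoch would go here ...
    if val_acc > best_acc:  # save only when the tracked metric improves
        best_acc = val_acc
        torch.save(
            {"epoch": epoch,
             "state_dict": model.state_dict(),
             "val_acc": val_acc},
            "best_model.pt",
        )

ckpt = torch.load("best_model.pt")
print(ckpt["epoch"], ckpt["val_acc"])  # epoch 3, acc 0.81 (the best so far)
```

Saving a dict with the epoch and metric alongside `state_dict` makes it easy to verify later which checkpoint you actually kept.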

Confusion matrix or equivalent visualization (ASCII)
    Confusion Matrix Example:

              Predicted
              P     N
    Actual P  TP    FN
           N  FP    TN

    TP = True Positives
    FP = False Positives
    TN = True Negatives
    FN = False Negatives

    Use this matrix to calculate metrics like accuracy, precision, recall.
    The best model saving pattern depends on which metric you want to maximize.
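The four counts above are all you need to compute these metrics. A quick sketch with illustrative counts (not from any real model):

```python
# Illustrative confusion-matrix counts: 50 actual positives, 950 negatives.
TP, FN, FP, TN = 40, 10, 5, 945

accuracy  = (TP + TN) / (TP + TN + FP + FN)  # fraction of all correct predictions
precision = TP / (TP + FP)                   # of predicted positives, how many are real
recall    = TP / (TP + FN)                   # of real positives, how many were caught
f1        = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note how accuracy (0.985) looks far better than recall (0.800) on this imbalanced example, which previews the accuracy-paradox pitfall discussed below.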
    
Precision vs Recall tradeoff with concrete examples

Choosing which metric to save your best model on depends on your problem:

  • High Precision: Save the model with the highest precision if false alarms are costly. Example: a spam filter that should not mark legitimate emails as spam.
  • High Recall: Save the model with the highest recall if you must catch as many positives as possible. Example: a cancer detector that should not miss any cancer cases.
  • Balanced (F1 score): Save the model with the best F1 score if you need a balance between precision and recall.
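The tradeoff is concrete: with the same training run, different metrics pick different checkpoints. The per-epoch values below are hypothetical, chosen to show precision falling as recall rises:

```python
# Hypothetical per-epoch validation metrics for one training run.
epochs = [
    {"epoch": 0, "precision": 0.95, "recall": 0.60},
    {"epoch": 1, "precision": 0.85, "recall": 0.80},
    {"epoch": 2, "precision": 0.70, "recall": 0.92},
]
for e in epochs:
    p, r = e["precision"], e["recall"]
    e["f1"] = 2 * p * r / (p + r)

best_precision = max(epochs, key=lambda e: e["precision"])["epoch"]  # spam filter
best_recall    = max(epochs, key=lambda e: e["recall"])["epoch"]     # cancer screen
best_f1        = max(epochs, key=lambda e: e["f1"])["epoch"]         # balanced
print(best_precision, best_recall, best_f1)  # 0 2 1
```

Each metric would save a different checkpoint (epoch 0, 2, or 1), so the choice of tracked metric is a real modeling decision, not a detail.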
What "good" vs "bad" metric values look like for this use case

Good model saving pattern means:

  • Saving the model checkpoint only when the validation metric improves.
  • Skipping checkpoints that perform worse than or equal to the previous best.
  • Using early stopping to avoid overfitting.

Bad pattern examples:

  • Saving model every epoch regardless of metric.
  • Saving based on training metric instead of validation metric.
  • Overwriting a better checkpoint with a worse one because metric fluctuations were ignored.
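The good pattern (save on improvement, stop early when the metric plateaus) can be sketched in a few lines. The metric values are made up, and the checkpoint save is a comment standing in for `torch.save`:

```python
def run(val_metrics, patience=2):
    """Track the best validation metric; stop after `patience`
    consecutive epochs without improvement (early stopping)."""
    best, best_epoch, bad_epochs = float("-inf"), -1, 0
    for epoch, m in enumerate(val_metrics):
        if m > best:
            best, best_epoch, bad_epochs = m, epoch, 0
            # torch.save(model.state_dict(), "best_model.pt") would go here
        else:
            bad_epochs += 1
            if bad_epochs >= patience:  # no improvement for `patience` epochs
                break
    return best_epoch, best

print(run([0.70, 0.76, 0.75, 0.74, 0.73]))  # -> (1, 0.76): stops early
```

Training stops at epoch 3 (two straight non-improving epochs), and the checkpoint from epoch 1 is the one kept, avoiding both the "save every epoch" and the "keep training into overfitting" anti-patterns.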
Metrics pitfalls (accuracy paradox, data leakage, overfitting indicators)
  • Accuracy paradox: High accuracy can be misleading if data is imbalanced. Saving best model by accuracy alone may not help.
  • Data leakage: If validation data leaks into training, the saved "best" model may not generalize.
  • Overfitting: Model saved with best validation metric may still overfit if metric fluctuates or validation set is small.
  • Metric choice: Saving based on wrong metric (e.g., training loss) can save a poor model.
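The accuracy paradox is easy to demonstrate with illustrative counts: on an imbalanced dataset, a "model" that always predicts the majority class scores high accuracy while catching nothing.

```python
# Illustrative imbalanced dataset: 20 fraud cases out of 1,000 transactions.
n_fraud, n_legit = 20, 980

# An "always predict not-fraud" classifier: TP=0, FN=20, FP=0, TN=980.
accuracy = (0 + n_legit) / (n_fraud + n_legit)
recall = 0 / n_fraud

print(accuracy, recall)  # 0.98 0.0
```

A checkpoint-saving loop tracking accuracy alone would happily keep this useless model, which is why the tracked metric must match the goal.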
Self-check: Your model has 98% accuracy but 12% recall on fraud. Is it good?

No, it is not good for fraud detection. Although overall accuracy is high, a recall of 12% means the model misses 88% of fraud cases, which is exactly what matters here. The high accuracy mostly reflects the abundance of non-fraud examples. You should save and select checkpoints based on recall (or another metric that rewards catching fraud), not accuracy.

Key Result
Save the model checkpoint that achieves the best validation metric aligned with your goal (e.g., accuracy, recall, or F1) to ensure the best real-world performance.