MixUp strategy in Computer Vision - Model Metrics & Evaluation

MixUp is a data augmentation method that blends pairs of images and their labels using a convex combination, which helps models generalize. Because it alters the training distribution, the key metrics to watch are validation accuracy and validation loss, which show whether the model performs well on unseen data. Robustness metrics, such as accuracy on noisy or adversarial examples, also matter, since MixUp aims to improve model stability.
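A minimal NumPy sketch of the MixUp idea described above. The Beta parameter `alpha=0.2`, the fixed seed, and the toy 2x2 "images" are illustrative choices, not values from this text:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, seed=0):
    """Blend two samples and their one-hot labels with a Beta-sampled weight."""
    lam = np.random.default_rng(seed).beta(alpha, alpha)  # mixing coefficient in (0, 1)
    x = lam * x1 + (1 - lam) * x2   # blended image
    y = lam * y1 + (1 - lam) * y2   # blended (soft) label
    return x, y, lam

# toy 2x2 "images" and one-hot labels for a 3-class problem
x_cat = np.ones((2, 2));  y_cat = np.array([1.0, 0.0, 0.0])
x_dog = np.zeros((2, 2)); y_dog = np.array([0.0, 1.0, 0.0])
x_mix, y_mix, lam = mixup(x_cat, y_cat, x_dog, y_dog)
```

Note that the blended label is soft: it still sums to 1 but spreads mass over both classes, which is what discourages overconfident predictions on ambiguous inputs.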
Actual \ Predicted | Cat | Dog | Bird
-------------------|-----|-----|-----
Cat                | 45  |  3  |  2
Dog                |  4  | 43  |  3
Bird               |  1  |  2  | 47
Total samples = 150
TP (example for Cat) = 45
FP (Cat predicted but not Cat) = 4 + 1 = 5
FN (Cat actual but not predicted) = 3 + 2 = 5
Precision (Cat) = TP / (TP + FP) = 45 / (45 + 5) = 0.9
Recall (Cat) = TP / (TP + FN) = 45 / (45 + 5) = 0.9
This matrix shows balanced precision and recall across classes, suggesting that training with MixUp helped the model generalize well.
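The per-class numbers above can be reproduced directly from the matrix. A short NumPy sketch (the matrix values are copied from the table; class order is Cat, Dog, Bird):

```python
import numpy as np

# rows = actual, cols = predicted (Cat, Dog, Bird)
cm = np.array([[45,  3,  2],
               [ 4, 43,  3],
               [ 1,  2, 47]])

tp = np.diag(cm)           # true positives per class
fp = cm.sum(axis=0) - tp   # column sum minus diagonal = false positives
fn = cm.sum(axis=1) - tp   # row sum minus diagonal = false negatives
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision[0], recall[0])  # Cat: 0.9 0.9
```

The same two lines of arithmetic give precision and recall for Dog and Bird as well, which is why a confusion matrix is the most compact way to audit per-class behavior.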
MixUp encourages smoother decision boundaries, which can improve both precision and recall. But sometimes, increasing recall (catching more true positives) may lower precision (more false positives). For example:
- High precision, low recall: Model is very sure but misses some true cases.
- High recall, low precision: Model catches most true cases but also many wrong ones.
MixUp helps balance this tradeoff by making the model less confident on ambiguous samples, improving overall robustness.
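The precision/recall tradeoff above can be illustrated by sweeping a decision threshold over predicted scores. The scores and labels below are made up purely for illustration:

```python
import numpy as np

# toy positive-class probabilities and ground-truth labels (illustrative only)
scores = np.array([0.95, 0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2])
labels = np.array([1,    1,   0,   1,   0,    1,   0,   0  ])

def precision_recall(threshold):
    """Classify as positive when score >= threshold, then compute both metrics."""
    pred = scores >= threshold
    tp = np.sum(pred & (labels == 1))
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    return tp / (tp + fp), tp / (tp + fn)

print(precision_recall(0.85))  # strict threshold: precision 1.0, recall 0.5
print(precision_recall(0.35))  # loose threshold: precision 2/3, recall 1.0
```

Raising the threshold makes the model "very sure but misses some true cases"; lowering it "catches most true cases but also many wrong ones", exactly the two bullets above.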
Good:
- Validation accuracy steadily improves or stays stable compared to baseline.
- Validation loss decreases smoothly without sudden jumps.
- Precision and recall are balanced and high (e.g., above 85%).
- Model performs well on noisy or mixed inputs.
Bad:
- Validation accuracy drops or fluctuates wildly.
- Validation loss is high or unstable.
- Precision or recall is very low, showing poor generalization.
- Model fails on augmented or mixed samples, indicating overfitting.
Common pitfalls:
- Accuracy paradox: accuracy may look good overall, yet the model can still fail on real-world mixed or imbalanced data.
- Data leakage: accidentally mixing training and validation data inflates metrics.
- Overfitting signs: a very low training loss combined with a high validation loss means the model memorizes instead of generalizing.
- Ignoring robustness: Only checking accuracy on clean data misses MixUp benefits on noisy inputs.
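One way to watch for the overfitting sign listed above is to flag epochs where the gap between validation and training loss grows too large. A sketch with an illustrative threshold (`gap_tol=0.5` is not a standard value, just a plausible default):

```python
def overfitting_flags(train_losses, val_losses, gap_tol=0.5):
    """Return epoch indices where validation loss exceeds training loss by more
    than gap_tol. The threshold is illustrative, not a standard value."""
    return [epoch for epoch, (tr, va) in enumerate(zip(train_losses, val_losses))
            if va - tr > gap_tol]

# toy loss curves: training keeps dropping while validation stalls
train = [1.2, 0.8, 0.5, 0.3, 0.1]
val   = [1.3, 1.0, 0.9, 0.9, 1.0]
print(overfitting_flags(train, val))  # flags epochs 3 and 4
```

In practice this check belongs in the training loop alongside early stopping, so a widening gap is caught before many epochs are wasted.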
Your model trained with MixUp has 98% accuracy but only 12% recall on a rare class. Is it good for production? Why or why not?
Answer: No, it is not ready for production. High accuracy is misleading when the rare class makes up only a small fraction of the data, because the model can score well by mostly predicting the majority class. A recall of 12% means the model misses almost all true cases of the rare class, which can be critical depending on the task. You should improve recall on that class (e.g., by reweighting or resampling), even if overall accuracy drops slightly.
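The accuracy-paradox numbers in this scenario are easy to reproduce with a toy imbalanced dataset. The class counts below (975 common, 25 rare, 3 rare samples caught) are chosen only to mirror the 98% / 12% figures:

```python
import numpy as np

# 975 common-class (0) samples, 25 rare-class (1) samples
y_true = np.array([0] * 975 + [1] * 25)
# model predicts the common class almost always: only 3 rare samples caught
y_pred = np.array([0] * 975 + [1] * 3 + [0] * 22)

accuracy = np.mean(y_true == y_pred)
rare_recall = np.sum((y_true == 1) & (y_pred == 1)) / np.sum(y_true == 1)
print(accuracy)     # 0.978 — looks excellent
print(rare_recall)  # 0.12  — the model misses 22 of 25 rare cases
```

This is exactly why per-class recall, not overall accuracy, is the metric to gate production deployment on when classes are imbalanced.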