Albumentations library in Computer Vision - Model Metrics & Evaluation

Albumentations improves model performance by applying realistic transformations to images before training. The key metrics to watch are validation accuracy and validation loss, which show whether the model learns better from augmented images. Robustness metrics, such as accuracy on unseen or noisy images, also matter because the whole point of augmentation is to make models resilient to such changes.
| Actual \ Predicted | Cat | Dog | Rabbit |
|--------------------|-----|-----|--------|
| Cat                | 45  | 3   | 2      |
| Dog                | 4   | 40  | 6      |
| Rabbit             | 1   | 5   | 44     |
This confusion matrix shows how well the model classifies images after training with Albumentations. More correct predictions on the diagonal indicate that the augmentation helped the model learn discriminative features.
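Accuracy and per-class precision/recall can be read straight off this matrix. The sketch below computes them in plain Python from the exact numbers in the table (rows are actual classes, columns are predictions):

```python
# Confusion matrix from the table above: rows = actual, columns = predicted
labels = ["Cat", "Dog", "Rabbit"]
cm = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 5, 44],
]

total = sum(sum(row) for row in cm)              # 150 images in total
correct = sum(cm[i][i] for i in range(3))        # 129 on the diagonal
accuracy = correct / total                       # 0.86

for i, name in enumerate(labels):
    tp = cm[i][i]
    recall = tp / sum(cm[i])                          # row sum = all actual members of the class
    precision = tp / sum(cm[r][i] for r in range(3))  # column sum = all predictions of the class
    print(f"{name}: precision={precision:.2f}, recall={recall:.2f}")
```

For example, Dog precision is 40 / (3 + 40 + 5) ≈ 0.83 while Dog recall is 40 / 50 = 0.80, so the model both over-predicts and under-detects dogs slightly.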
When using Albumentations, you want to strike a balance between precision and recall. Over-augmenting can distort images beyond what the model will see at test time, lowering precision (more false alarms); under-augmenting can leave the model brittle, lowering recall (missed real cases). Good augmentation finds a sweet spot that keeps both high.
Example: In a dog detector, high precision means few wrong "dog" labels, and high recall means catching most real dogs. Albumentations helps by exposing the model to varied dog images during training.
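The precision/recall trade-off for such a detector comes down to three counts. The numbers below are invented purely for illustration:

```python
# Hypothetical dog-detector counts (made up for illustration)
tp = 80   # real dogs correctly labeled "dog"
fp = 10   # false alarms: "dog" labels on non-dogs
fn = 20   # missed dogs

precision = tp / (tp + fp)  # 80/90 ~ 0.89: how many "dog" labels were right
recall = tp / (tp + fn)     # 80/100 = 0.80: how many real dogs were caught

# F1 is the harmonic mean; it stays high only when BOTH metrics are high
f1 = 2 * precision * recall / (precision + recall)
print(f"precision={precision:.2f}, recall={recall:.2f}, f1={f1:.2f}")
```

Watching F1 alongside accuracy is a quick way to spot when an augmentation change trades one metric away for the other.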
- Good: Validation accuracy above 85%, balanced precision and recall above 80%, and stable loss decrease during training.
- Bad: Validation accuracy below 70%, big gap between precision and recall (like 90% vs 50%), or validation loss increasing (overfitting or bad augmentation).
Good metrics mean Albumentations helped the model learn useful features. Bad metrics suggest augmentation was too weak, too strong, or not suitable.
- Accuracy paradox: High accuracy but poor recall on rare classes means augmentation missed important cases.
- Data leakage: Using augmented versions of test images in training inflates metrics falsely.
- Overfitting: If validation loss rises while training loss falls, augmentation might be too weak or inconsistent.
- Wrong augmentation: Using unrealistic transformations can confuse the model and hurt metrics.
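The accuracy-paradox pitfall above is easy to reproduce with a few lines of plain Python. The counts here are invented to mirror the scenario of a severely imbalanced dataset:

```python
# 1000 samples: 975 common class (0), 25 rare class (1).
# The model predicts the common class almost everywhere and
# finds only 3 of the 25 rare cases.
y_true = [0] * 975 + [1] * 25
y_pred = [0] * 975 + [1] * 3 + [0] * 22

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
rare_recall = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred)) / 25

print(f"accuracy={accuracy:.3f}, rare-class recall={rare_recall:.2f}")
# Near-98% accuracy coexists with 12% recall on the class that matters.
```

If the rare class is what you actually care about, this model is almost useless despite its headline accuracy, which is why per-class recall must be tracked alongside overall accuracy.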
Your model trained with Albumentations reaches 98% accuracy but only 12% recall on a rare class. Is that good?
Answer: No. Low recall means the model misses most rare cases, which matters whenever those cases are the ones you care about (e.g. defects or diseases). Overall accuracy is misleading when the rare class is a small fraction of the data. Adjust the augmentation (for example, augment the rare class more heavily) or the model to improve recall.