In computer vision, architecture design shapes how well a model learns and predicts. Key metrics include accuracy for overall correctness, precision and recall for class-specific performance, and the F1 score, which balances precision and recall. Together, these metrics show whether the architecture extracts useful features and generalizes well.
Why architecture design impacts performance in Computer Vision: Why Metrics Matter
                 Predicted
             | Cat | Dog |
 ------------+-----+-----+
 Actual Cat  |  50 |  10 |
 Actual Dog  |   5 |  35 |
TP (Cat) = 50, FN (Cat) = 10, FP (Cat) = 5, TN (Cat) = 35
From these four counts you can calculate precision and recall for each class, showing exactly where the architecture's correct and wrong predictions fall.
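Reading rows as actual classes and columns as predicted, the per-class metrics follow directly from the four counts. A minimal Python sketch using the Cat-class numbers from the matrix above:

```python
# Metrics for the Cat class, taking rows as actual and columns as predicted.
tp = 50  # actual Cat, predicted Cat
fn = 10  # actual Cat, predicted Dog (missed cats)
fp = 5   # actual Dog, predicted Cat (false alarms)
tn = 35  # actual Dog, predicted Dog

precision = tp / (tp + fp)                          # 50 / 55 ~ 0.909
recall = tp / (tp + fn)                             # 50 / 60 ~ 0.833
f1 = 2 * precision * recall / (precision + recall)  # ~ 0.870
accuracy = (tp + tn) / (tp + fn + fp + tn)          # 85 / 100 = 0.85

print(f"precision={precision:.3f} recall={recall:.3f} "
      f"f1={f1:.3f} accuracy={accuracy:.3f}")
```

Note that accuracy (0.85) hides the asymmetry: the model misses twice as many cats (10) as it falsely detects (5), which only precision and recall reveal.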
A complex architecture might improve recall by detecting more true objects but lower precision by adding false detections; a simpler design might achieve high precision yet miss some objects (low recall). The right choice depends on whether missed objects or false alarms are costlier.
Example: In face recognition, high precision avoids false matches, but in medical image detection, high recall avoids missing diseases.
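This trade-off often surfaces in a detector's confidence threshold. The sketch below uses made-up scores and labels (illustrative values, not from any real model) to show how tightening or loosening the threshold moves precision and recall in opposite directions:

```python
# Hypothetical detection scores and ground-truth labels (1 = real object).
scores = [0.95, 0.90, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    1,    0,    1,    0]

def precision_recall(threshold):
    """Precision and recall when detections with score >= threshold are kept."""
    preds = [s >= threshold for s in scores]
    tp = sum(p and y == 1 for p, y in zip(preds, labels))
    fp = sum(p and y == 0 for p, y in zip(preds, labels))
    fn = sum(not p and y == 1 for p, y in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

# Strict threshold: every kept detection is correct, but 3 of 5 objects are missed.
print(precision_recall(0.85))  # (1.0, 0.4)
# Loose threshold: all 5 objects are found, but 2 false alarms slip in.
print(precision_recall(0.25))  # (~0.714, 1.0)
```

The same trained model yields both operating points; the architecture and threshold together determine which end of the trade-off you sit on.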
Good: Accuracy above 90%, precision and recall balanced above 85%, F1 score high. This means the architecture captures features well and predicts reliably.
Bad: Accuracy high but recall very low (e.g., 40%), or precision very low. This shows the architecture misses many true cases or makes many false alarms, hurting performance.
- Overfitting: Complex architectures may memorize training data, showing high accuracy but poor real-world results.
- Data leakage: If test data leaks into training, metrics look falsely good, hiding architecture flaws.
- Ignoring class imbalance: Accuracy can be misleading if one class dominates; precision and recall give clearer insight.
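As a concrete illustration of the class-imbalance pitfall (using made-up counts): with 990 background images and only 10 object images, a model that always predicts "background" looks excellent by accuracy alone.

```python
# 990 negatives, 10 positives; the model always predicts the majority class.
n_neg, n_pos = 990, 10
tp, fn = 0, n_pos   # no positives are ever detected
tn, fp = n_neg, 0   # every negative is trivially correct

accuracy = (tp + tn) / (n_pos + n_neg)  # 0.99 -- looks great
recall = tp / (tp + fn)                 # 0.0  -- useless for the object class
print(accuracy, recall)
```
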
Your model has 98% accuracy but only 12% recall on detecting a rare object. Is it good for production?
Answer: No. The model misses most true objects (low recall), so it fails its purpose despite the high accuracy, which is inflated by the dominant negative class. The architecture likely does not capture the features that distinguish that object.
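One set of counts consistent with the quiz (hypothetical numbers, chosen so the arithmetic works out exactly): 2,500 test images, 50 of which contain the rare object.

```python
# Hypothetical counts: 2500 images, 50 containing the rare object.
tp, fn = 6, 44    # only 6 of the 50 objects are found -> recall 0.12
tn, fp = 2444, 6  # background is almost always classified correctly

accuracy = (tp + tn) / (tp + fn + tn + fp)  # 2450 / 2500 = 0.98
recall = tp / (tp + fn)                     # 6 / 50 = 0.12
print(accuracy, recall)
```

The abundant negative class carries accuracy to 98% even though 44 of the 50 real objects are missed, which is exactly why recall must be checked separately.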