Computer Vision · ~8 mins

Pre-trained detection models in Computer Vision - Model Metrics & Evaluation

Which metric matters for Pre-trained detection models and WHY

For pre-trained detection models, the key metrics are Precision, Recall, and F1 score. These models find objects in images, so we want to know how many detected objects are correct (Precision) and how many real objects were found (Recall). The F1 score balances both. Also, mean Average Precision (mAP) is often used to measure overall detection quality across classes and thresholds.
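To make mAP less abstract: Average Precision (AP) for a single class is the area under the precision-recall curve built by sweeping over detections sorted by confidence, and mAP averages AP across classes (and often across IoU thresholds). Here is a minimal sketch, assuming each detection has already been matched against ground truth at a fixed IoU threshold; the function name and the example counts are illustrative, not from any particular library.

```python
def average_precision(matches, num_ground_truth):
    """AP for one class.

    matches: detection outcomes (True = correct match), sorted by
             confidence in descending order.
    num_ground_truth: total number of real objects of this class.
    """
    tp = fp = 0
    points = []  # (recall, precision) after each detection
    for is_match in matches:
        if is_match:
            tp += 1
        else:
            fp += 1
        points.append((tp / num_ground_truth, tp / (tp + fp)))

    # Area under the curve: sum precision * recall-step (all-point form)
    ap, prev_recall = 0.0, 0.0
    for rec, prec in points:
        ap += prec * (rec - prev_recall)
        prev_recall = rec
    return ap

# Hypothetical example: 4 detections against 5 real objects
print(average_precision([True, True, False, True], 5))  # 0.55
```

Real evaluators (e.g. COCO-style) add interpolation and multiple IoU thresholds, but the core idea is this accumulation over ranked detections.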

Confusion matrix for object detection (simplified)
                       | Predicted Object    | Predicted No Object |
      |----------------|---------------------|---------------------|
      | Actual Object   | True Positive (TP)  | False Negative (FN) |
      | Actual No Object| False Positive (FP) | True Negative (TN)  |

    TP: Correctly detected objects
    FP: Wrong detections (false alarms)
    FN: Missed objects
    TN: Correctly ignored background
    

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)
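These two formulas translate directly into code. A minimal sketch, using hypothetical counts (80 correct detections, 10 false alarms, 20 missed objects):

```python
def precision(tp, fp):
    # Of everything the model detected, what fraction was correct?
    return tp / (tp + fp)

def recall(tp, fn):
    # Of all real objects, what fraction did the model find?
    return tp / (tp + fn)

# Hypothetical counts for one class
tp, fp, fn = 80, 10, 20
print(precision(tp, fp))  # 80 / 90
print(recall(tp, fn))     # 80 / 100 = 0.8
```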

Precision vs Recall tradeoff with examples

If the model is very strict, it detects fewer objects but with high confidence. This means high precision but low recall. For example, in security cameras, you want to avoid false alarms (high precision).

If the model detects many objects, including uncertain ones, it has high recall but low precision. For example, in wildlife monitoring, missing an animal is worse, so high recall is preferred.

The F1 score helps balance these two depending on the use case.
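The F1 score is the harmonic mean of precision and recall, which punishes imbalance between the two. A quick sketch with illustrative values: a strict model (precision 0.95, recall 0.5) scores lower than a balanced one where both are 0.7.

```python
def f1_score(prec, rec):
    # Harmonic mean: drops sharply when either metric is low
    return 2 * prec * rec / (prec + rec)

print(f1_score(0.95, 0.5))  # about 0.655, despite the high precision
print(f1_score(0.7, 0.7))   # 0.7, better balance wins
```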

What "good" vs "bad" metric values look like for pre-trained detection models

Good: Precision and Recall both above 0.8, F1 score near 0.85 or higher, and mAP above 0.75. This means the model finds most objects correctly and misses few.

Bad: Precision below 0.5 means many false detections. Recall below 0.5 means many missed objects. Low F1 and mAP indicate poor detection quality.

Common pitfalls in metrics for pre-trained detection models
  • Accuracy paradox: High accuracy can be misleading if most images have no objects (true negatives dominate).
  • Data leakage: Using test images in training inflates metrics falsely.
  • Overfitting: Very high training metrics but low test metrics show poor generalization.
  • Ignoring IoU threshold: Detection quality depends on Intersection over Union threshold; metrics vary with it.
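The IoU pitfall above is worth seeing concretely: a detection only counts as a TP if its box overlaps a ground-truth box by at least the chosen IoU threshold, so the same predictions give different metrics at different thresholds. A minimal sketch for axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(a, b):
    # Coordinates of the intersection rectangle
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Partially overlapping boxes: IoU = 1/7, a TP at threshold 0.1
# but an FP at the common 0.5 threshold
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```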
Self-check question

Your pre-trained detection model has 98% accuracy but only 12% recall on detecting cars. Is it good for production? Why or why not?

Answer: No, it is not good. The high accuracy is misleading because most images may not have cars, so the model correctly predicts no car often (TN). The very low recall means it misses most cars, which is bad for detection tasks where finding objects is critical.
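The self-check numbers can be reproduced with a hypothetical confusion matrix where background frames dominate (all counts below are invented to match 98% accuracy and 12% recall):

```python
# 100 frames actually contain cars, 9900 do not
tp, fn = 12, 88        # only 12 of 100 cars found
tn, fp = 9788, 112     # background correctly ignored most of the time

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(accuracy)  # 0.98 — looks excellent
print(recall)    # 0.12 — misses 88% of cars
```

The abundant true negatives inflate accuracy while the model fails at its actual job of finding cars.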

Key Result
Precision, Recall, and mAP are key metrics to evaluate how well pre-trained detection models find and correctly identify objects.