
SSD concept in Computer Vision - Model Metrics & Evaluation

Which metric matters for SSD and WHY

SSD (Single Shot MultiBox Detector) is used for object detection. It finds objects and draws boxes around them. The key metrics are Precision, Recall, and mAP (mean Average Precision).

Precision tells us how many of the detected boxes are correct. Recall tells us how many of the real objects were found. For each class, Average Precision (AP) summarizes the precision-recall tradeoff as the area under the precision-recall curve; mAP averages AP across all object classes to give one overall detection score.
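As a quick illustration, the two count-based metrics can be computed directly from matched-box totals. This is a minimal sketch; the counts below are made up for the example:

```python
def precision_recall(tp, fp, fn):
    """Detection precision and recall from matched box counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical evaluation run: 80 correct boxes, 20 false alarms, 30 missed objects
p, r = precision_recall(tp=80, fp=20, fn=30)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.80, recall=0.73
```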

We want high precision to avoid false alarms and high recall to find all objects. mAP summarizes this balance and is the main score to compare SSD models.
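Per-class AP is the area under the precision-recall curve, and mAP is the mean of those AP values. A minimal sketch of the common all-point interpolation, assuming the (recall, precision) points for one class have already been computed:

```python
def average_precision(recalls, precisions):
    """AP as area under the precision-recall curve,
    using all-point interpolation (monotone precision envelope)."""
    pts = sorted(zip(recalls, precisions))
    rs = [0.0] + [r for r, _ in pts]
    ps = [p for _, p in pts]
    # Make precision non-increasing from right to left (the envelope)
    for i in range(len(ps) - 2, -1, -1):
        ps[i] = max(ps[i], ps[i + 1])
    # Sum rectangle areas between consecutive recall points
    return sum((rs[i + 1] - rs[i]) * ps[i] for i in range(len(ps)))

# Hypothetical PR points for one class
ap = average_precision([0.2, 0.4, 0.6], [1.0, 0.8, 0.5])
print(round(ap, 3))  # 0.46
```

mAP is then just the mean of these per-class AP values.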

Confusion matrix for SSD detection

For object detection, confusion is about boxes:

    +----------------+-----------------------------+
    |                |          Predicted          |
    |                +--------------+--------------+
    |                | Object       | No Object    |
    +----------------+--------------+--------------+
    | Actual Object  | TP           | FN           |
    +----------------+--------------+--------------+
    | Actual No Obj  | FP           | TN           |
    +----------------+--------------+--------------+

TP = Correct boxes matching real objects (IoU > threshold).
FP = Boxes predicted but no real object there.
FN = Real objects missed by SSD.
TN = Background correctly ignored (usually very large, less focus).
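The IoU (Intersection over Union) test that decides whether a predicted box counts as a TP can be sketched in a few lines, assuming boxes are given as (x1, y1, x2, y2) corners:

```python
def iou(box_a, box_b):
    """Intersection-over-Union for boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# A predicted box half-overlapping a ground-truth box
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 3))  # 0.333: below 0.5, so an FP
```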

Precision vs Recall tradeoff in SSD

If SSD is tuned to be very sure before drawing a box, it has high precision but might miss objects, so low recall. This means fewer false alarms but more missed detections.

If SSD tries to find every object, it has high recall but may draw many wrong boxes, so low precision. This means fewer misses but more false alarms.

Example: In self-driving cars, missing a pedestrian (low recall) is dangerous, so recall is more important. In photo tagging, false tags (low precision) annoy users, so precision matters more.
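In practice the tradeoff comes down to the confidence threshold used to keep or drop predicted boxes. A toy sweep over two thresholds, with made-up detections, shows the effect:

```python
# Hypothetical detections: (confidence, matched_a_real_object) pairs
detections = [(0.95, True), (0.90, True), (0.80, False),
              (0.70, True), (0.60, False), (0.40, True)]
num_gt = 5  # ground-truth objects in this toy image set

def prf_at(threshold):
    """Precision and recall if only boxes at or above `threshold` are kept."""
    kept = [ok for conf, ok in detections if conf >= threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 0.0
    recall = tp / num_gt
    return precision, recall

for t in (0.85, 0.50):
    p, r = prf_at(t)
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

Raising the threshold trades recall for precision; lowering it does the opposite.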

Good vs Bad metric values for SSD

Good SSD model:

  • Precision > 0.8 (80% of predicted boxes are correct)
  • Recall > 0.7 (70% of real objects detected)
  • mAP > 0.75 (high overall detection quality)

Bad SSD model:

  • Precision < 0.5 (many false boxes)
  • Recall < 0.4 (many missed objects)
  • mAP < 0.4 (poor detection performance)

Good values depend on dataset difficulty and use case, but these are typical ranges.

Common pitfalls in SSD metrics
  • Accuracy paradox: Overall accuracy can be misleading because background dominates. SSD classifies most background anchor boxes correctly while still missing real objects.
  • IoU threshold choice: Too low threshold inflates TP, too high misses correct boxes.
  • Data leakage: Training and test images overlap, inflating metrics.
  • Overfitting: Model performs well on training but poorly on new images.
  • Ignoring small objects: SSD may miss small objects, lowering recall.
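The IoU-threshold pitfall above can be seen with a single borderline box: the same prediction flips between TP and FP depending on the threshold. A small sketch with a made-up box pair:

```python
def iou(a, b):
    """IoU for (x1, y1, x2, y2) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

pred, gt = (0, 0, 10, 10), (3, 0, 13, 10)  # borderline overlap
score = iou(pred, gt)                       # ~0.54
for threshold in (0.5, 0.75):
    verdict = "TP" if score > threshold else "FP"
    print(f"IoU={score:.2f}, threshold={threshold}: {verdict}")
```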
Self-check question

Your SSD model has 98% accuracy but only 12% recall on detecting pedestrians. Is it good for production?

Answer: No. The high accuracy is misleading because most pixels are background. The very low recall means it misses almost all pedestrians, which is dangerous for real use. You need to improve recall even if accuracy drops.
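The self-check numbers can be reproduced with hypothetical anchor-level counts chosen so that background dominates:

```python
# Hypothetical counts: 100 pedestrians total, background anchors dominate
tp, fn = 12, 88      # only 12 of 100 pedestrians found
fp, tn = 112, 9788   # background anchors correctly ignored far outnumber objects
accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)
print(f"accuracy={accuracy:.0%}, recall={recall:.0%}")  # accuracy=98%, recall=12%
```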

Key Result
For SSD, mean Average Precision (mAP) best shows detection quality by balancing precision and recall.