Non-maximum suppression in PyTorch - Model Metrics & Evaluation

Which metrics matter for non-maximum suppression and why

Non-maximum suppression (NMS) is the post-processing step in object detection that removes redundant overlapping boxes and keeps the highest-scoring box for each object. It works greedily: take the highest-scoring box, discard every remaining box whose Intersection over Union (IoU) with it exceeds a threshold, and repeat with the boxes that are left. The key metrics for evaluating NMS are precision and recall. Precision tells us what fraction of the boxes we kept are correct (not false alarms), while recall tells us what fraction of the true objects we found. We want a balance: keep the true objects (high recall) while avoiding duplicate or wrong boxes (high precision). The IoU threshold in NMS controls this balance by deciding how much two boxes may overlap before the lower-scoring one is removed.
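The greedy procedure behind NMS can be sketched in plain PyTorch as follows. This is a minimal illustration, not the library implementation: the helper names `iou_one_to_many` and `greedy_nms`, the (x1, y1, x2, y2) box format, and the example boxes are assumptions made for this sketch. In practice, torchvision ships a built-in `torchvision.ops.nms(boxes, scores, iou_threshold)`.

```python
import torch


def iou_one_to_many(box: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
    """IoU between one box of shape (4,) and N boxes of shape (N, 4).

    Boxes are assumed to be in (x1, y1, x2, y2) format.
    """
    x1 = torch.maximum(box[0], others[:, 0])
    y1 = torch.maximum(box[1], others[:, 1])
    x2 = torch.minimum(box[2], others[:, 2])
    y2 = torch.minimum(box[3], others[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)


def greedy_nms(boxes: torch.Tensor, scores: torch.Tensor, iou_threshold: float) -> list:
    """Return indices of kept boxes, highest score first."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        best = order[0]
        keep.append(best.item())
        rest = order[1:]
        if rest.numel() == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        # Drop every box that overlaps the current best box too much.
        order = rest[ious <= iou_threshold]
    return keep


# Hypothetical example: boxes 0 and 1 overlap heavily (IoU about 0.68),
# box 2 is far away from both.
boxes = torch.tensor([[0.0, 0.0, 10.0, 10.0],
                      [1.0, 1.0, 11.0, 11.0],
                      [20.0, 20.0, 30.0, 30.0]])
scores = torch.tensor([0.9, 0.8, 0.7])

kept_strict = greedy_nms(boxes, scores, iou_threshold=0.5)  # box 1 is suppressed
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.7)   # box 1 survives
print(kept_strict, kept_loose)
```

With the stricter threshold the duplicate box is removed; with the looser one it is kept, which is the behavior the rest of this page evaluates with precision and recall.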

Confusion matrix for object detection with NMS (simplified)

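      |                  | Predicted Object    | Predicted No Object |
      |------------------|---------------------|---------------------|
      | Actual Object    | True Positive (TP)  | False Negative (FN) |
      | Actual No Object | False Positive (FP) | True Negative (TN)  |

      TP: Correct boxes kept after NMS
      FP: Wrong or duplicate boxes kept (false alarms)
      FN: True objects left without a kept box, whether NMS removed it or the detector never proposed it (missed objects)
      TN: Background correctly ignored (rarely counted in detection, because the number of possible background boxes is not well defined)

      Precision = TP / (TP + FP)
      Recall    = TP / (TP + FN)
      Total samples = TP + FP + FN + TN (only meaningful when TN is defined, for example over a fixed set of candidate regions)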
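One way to obtain these counts is to match the boxes kept after NMS against the ground-truth boxes. The sketch below is an illustration under stated assumptions, not a standard evaluation protocol: the function names, the greedy one-to-one matching rule, the matching IoU of 0.5, and the example boxes are all hypothetical choices made for this example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def count_matches(detections, ground_truth, match_iou=0.5):
    """Count TP, FP, FN. `detections` must be sorted by descending score."""
    matched = set()
    tp = fp = 0
    for det in detections:
        best_j, best_iou = None, match_iou
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue  # each true object can be claimed only once
            overlap = iou(det, gt)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            fp += 1  # wrong box, or a duplicate of an already matched object
        else:
            matched.add(best_j)
            tp += 1
    fn = len(ground_truth) - len(matched)
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical scene: three true objects, four kept boxes.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
detections = [
    (0, 0, 10, 10),    # correct
    (1, 1, 11, 11),    # duplicate of the first object
    (20, 20, 30, 30),  # correct
    (80, 80, 90, 90),  # background false alarm
]

tp, fp, fn = count_matches(detections, ground_truth)
precision, recall = precision_recall(tp, fp, fn)
print(tp, fp, fn, precision, recall)
```

Note that the duplicate box counts as a false positive, which is why leftover duplicates after NMS lower precision. TN does not appear because it is not needed for precision or recall.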

Precision vs Recall tradeoff in Non-maximum suppression

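If the IoU threshold is too low, NMS suppresses aggressively: any box that overlaps a higher-scoring box even moderately is removed. This removes duplicates and raises precision (fewer false alarms), but it can also remove the box of a neighboring object that genuinely overlaps, which lowers recall (missed true objects). For example, in a face detector, overly strict NMS might drop faces that partly overlap in a crowd (low recall).

If the IoU threshold is too high, NMS removes only near-identical boxes, so several boxes for the same object survive. Recall stays high, but every extra box counts as a false alarm, which lowers precision. For example, in a car detector, overly loose NMS might report the same car several times (low precision).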

Choosing the right IoU threshold balances precision and recall depending on the task needs.
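The tradeoff can be seen on a toy scene by sweeping the threshold. The following is a plain-Python sketch so that it runs on its own; the boxes, scores, and the `object_ids` labels (which say which true object each box belongs to) are hypothetical values invented for this illustration.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) tuples."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: returns indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Two neighboring objects, A and B, plus a duplicate box for A.
boxes = [(0, 0, 10, 10),   # object A
         (1, 0, 11, 10),   # duplicate of A (IoU with A about 0.82)
         (5, 0, 15, 10)]   # object B (IoU with A about 0.33)
scores = [0.9, 0.8, 0.7]
object_ids = ["A", "A", "B"]


def sweep(thresholds):
    """Map each NMS threshold to (precision, recall) on the toy scene."""
    results = {}
    for t in thresholds:
        kept = nms(boxes, scores, t)
        tp = len({object_ids[i] for i in kept})  # distinct objects found
        fp = len(kept) - tp                      # extra boxes are duplicates
        fn = len(set(object_ids)) - tp           # objects with no kept box
        results[t] = (tp / (tp + fp), tp / (tp + fn))
    return results


results = sweep([0.2, 0.5, 0.9])
for t, (p, r) in results.items():
    print(f"threshold={t}: precision={p:.2f}, recall={r:.2f}")
```

In this scene the lowest threshold suppresses object B along with the duplicate (recall drops), the highest threshold keeps the duplicate (precision drops), and the middle value keeps exactly one box per object.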

What "good" vs "bad" metric values look like for NMS

Good NMS: Precision and recall both above roughly 0.8 (a rule of thumb; the right target depends on the task), meaning most true objects are detected and few false boxes remain.
Bad NMS: Precision below 0.5 means more than half of the kept boxes are wrong or duplicates, cluttering results.
Recall below 0.5 means more than half of the true objects are missed, which is bad for safety-critical tasks like pedestrian detection.
Very high recall but very low precision means many duplicates or false alarms, often a sign that the IoU threshold is too high.

Common pitfalls when evaluating NMS metrics

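Ignoring IoU threshold: Different IoU thresholds change precision and recall drastically, so always report the NMS threshold used, along with the IoU required to count a detection as a match to a true object.
Overfitting to training data: An NMS threshold tuned too tightly on training data may fail on new images, so tune it on a held-out validation set.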
Confusing precision and recall: Precision is about false alarms, recall is about missed objects.
Not considering class imbalance: If some classes are rare, metrics can be misleading.
Using only accuracy: Accuracy is not meaningful for object detection because background dominates.

Self-check question

Your object detector with NMS has 98% accuracy but only 12% recall on pedestrians. Is it good for production? Why or why not?

Answer: No, it is not good. The high accuracy is misleading because most of the image is background (easy to classify). The very low recall means the detector misses most pedestrians, which is dangerous for applications like self-driving cars. Improving recall while keeping precision reasonable is critical.
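The arithmetic behind this answer can be checked with a small worked example. The counts below are hypothetical, chosen only to reproduce the 98% accuracy and 12% recall in the question; they assume the detector scores a fixed set of 10,000 candidate regions so that true negatives can be counted.

```python
# Hypothetical counts (not from any real detector).
total_regions = 10_000   # candidate regions scored by the detector
pedestrians = 200        # regions that truly contain a pedestrian

tp = 24                               # pedestrians found
fn = pedestrians - tp                 # pedestrians missed: 176
fp = 24                               # background regions flagged as pedestrians
tn = total_regions - tp - fn - fp     # background correctly ignored: 9776

accuracy = (tp + tn) / total_regions
recall = tp / (tp + fn)
precision = tp / (tp + fp)

print(f"accuracy={accuracy:.2f}, recall={recall:.2f}, precision={precision:.2f}")
```

Almost all of the accuracy comes from the 9,776 background regions, while 176 of the 200 pedestrians are missed. That is why accuracy hides the problem and recall exposes it.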

Key Result

Non-maximum suppression balances precision and recall by removing overlapping boxes; the IoU threshold controls this tradeoff.