Custom detection dataset in PyTorch - Model Metrics & Evaluation
Which metrics matter for a custom detection dataset, and why
For object detection on a custom dataset, the key metrics are Precision, Recall, and mean Average Precision (mAP). Precision tells us how many detected objects are actually correct, while Recall tells us how many real objects were found. mAP summarizes how well the model detects objects across different classes and confidence thresholds. These metrics matter because detection involves both locating and classifying objects, so we must balance finding every object (Recall) against avoiding false alarms (Precision).
Confusion Matrix for Object Detection
In detection, the confusion matrix is more complex because each prediction includes a bounding box as well as a class. For a single class, it can be simplified as:
+----------------+--------------------+
|                |  Predicted Object  |
|                |   Yes       No     |
+----------------+--------------------+
| Actual Obj Yes |   TP        FN     |
| Actual Obj No  |   FP        TN     |
+----------------+--------------------+
Where:
- TP: Correctly detected objects with sufficient overlap (IoU above the threshold, commonly 0.5)
- FP: Wrong detections, low-overlap boxes, or duplicate detections of the same object
- FN: Missed objects
- TN: Usually not counted in detection, since the number of "background" regions is effectively unbounded
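The counting above can be sketched in plain Python. This is a minimal greedy matcher, not the full COCO evaluation protocol; the box format (x1, y1, x2, y2) and the function names are our own choices for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def count_tp_fp_fn(preds, gts, iou_thresh=0.5):
    """Greedily match predicted boxes to ground-truth boxes."""
    matched = set()
    tp = fp = 0
    for p in preds:
        best, best_i = 0.0, None
        for i, g in enumerate(gts):
            if i in matched:
                continue  # each ground-truth box matches at most once
            overlap = iou(p, g)
            if overlap > best:
                best, best_i = overlap, i
        if best_i is not None and best >= iou_thresh:
            tp += 1
            matched.add(best_i)
        else:
            fp += 1  # low overlap or duplicate of an already-matched box
    fn = len(gts) - len(matched)  # ground-truth boxes nobody claimed
    return tp, fp, fn

# One matching box, one stray prediction, one missed object:
print(count_tp_fp_fn([(0, 0, 10, 10), (20, 20, 30, 30)],
                     [(0, 0, 10, 10), (50, 50, 60, 60)]))  # (1, 1, 1)
```

Because each ground-truth box can be matched only once, a second detection of the same object correctly counts as a false positive rather than a second true positive.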
Precision vs Recall Tradeoff with Examples
Imagine a security camera detecting people:
- High Precision, Low Recall: The camera only alerts when very sure. Few false alarms, but might miss some people.
- High Recall, Low Precision: The camera alerts on many things, catching almost all people but also many false alarms.
Choosing the right balance depends on the use case. For safety, high recall is better so that no one is missed. For convenience, high precision avoids false alerts.
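The tradeoff is controlled by the confidence threshold: raising it trades recall for precision. A minimal sketch, assuming a list of scored detections that have already been judged correct or not (the function name and the toy numbers are hypothetical):

```python
def precision_recall_at(detections, n_actual, threshold):
    """Precision and recall after discarding low-confidence detections.
    detections: list of (confidence, is_correct); n_actual: ground-truth count."""
    kept = [d for d in detections if d[0] >= threshold]
    tp = sum(1 for _, correct in kept if correct)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 1.0  # no detections -> no false alarms
    recall = tp / n_actual
    return precision, recall

# 6 detections of 5 actual people (one person is never detected at all)
dets = [(0.95, True), (0.90, True), (0.70, False),
        (0.60, True), (0.40, False), (0.30, True)]

print(precision_recall_at(dets, 5, 0.85))  # (1.0, 0.4): sure, but misses people
print(precision_recall_at(dets, 5, 0.20))  # (~0.667, 0.8): catches more, more alarms
```

Sweeping the threshold from high to low traces out the precision-recall curve, which is exactly what AP summarizes.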
Good vs Bad Metric Values for Custom Detection Dataset
- Good: Precision and Recall above 0.8 and mAP above 0.75 mean the model detects most objects correctly and rarely makes mistakes.
- Bad: Precision or Recall below 0.5 means many false detections or many missed objects; mAP below 0.4 indicates poor overall detection quality. (Exact targets depend on the task and how hard the dataset is.)
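To make mAP concrete, here is a minimal sketch of Average Precision for a single class: the area under the interpolated precision-recall curve. mAP then averages AP over classes (and, in COCO-style evaluation, over IoU thresholds from 0.5 to 0.95); this simplified version assumes each prediction is already labeled as a true positive or not.

```python
def average_precision(scored_hits, n_gt):
    """AP for one class: area under the interpolated precision-recall curve.
    scored_hits: list of (confidence, is_tp) for every prediction.
    n_gt: number of ground-truth objects of this class."""
    scored = sorted(scored_hits, key=lambda d: -d[0])  # best-first
    tp = fp = 0
    points = []  # (precision, recall) after each prediction is admitted
    for _, hit in scored:
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / n_gt))
    ap, prev_recall = 0.0, 0.0
    for i, (_, recall) in enumerate(points):
        # Interpolated precision: best precision at this recall or higher
        precision = max(p for p, _ in points[i:])
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Two objects; the model ranks one correct box, one wrong, one correct:
print(average_precision([(0.9, True), (0.8, False), (0.7, True)], 2))  # ~0.833
```

A model that ranks all correct detections above all mistakes gets AP = 1.0, so AP rewards confidence calibration as well as raw detection quality.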
Common Pitfalls in Detection Metrics
- Accuracy Paradox: High accuracy can be misleading if most images contain no objects. The model can always guess "no object", be right most of the time, and still be useless.
- Data Leakage: Using test images in training inflates metrics falsely.
- Overfitting: Very high training metrics but low test metrics means the model memorizes training data and won't generalize.
- Ignoring IoU Threshold: Counting detections with low overlap as correct inflates metrics unfairly.
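The last pitfall is easy to demonstrate with numbers: a sloppy box that barely clips the object has a tiny IoU, and only a too-low threshold lets it count as correct. The boxes below are invented for illustration.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

gt = (0, 0, 100, 100)        # ground-truth box
pred = (60, 60, 160, 160)    # sloppy prediction: only a corner overlaps

overlap = iou(gt, pred)      # 1600 / 18400, about 0.087
print(overlap >= 0.5)        # False: a false positive at the standard threshold
print(overlap >= 0.05)       # True: "correct" only because the bar is absurdly low
```

Always report the IoU threshold alongside your metrics; "mAP@0.5" and "mAP@0.75" can differ dramatically for the same model.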
Self Check
Your detection model has 98% accuracy but only 12% recall on detecting cars. Is it good for production?
Answer: No. The high accuracy is misleading because most images might have no cars, so guessing "no car" is often right. The very low recall means the model misses most cars, which is bad for detection. You should improve recall before using it in production.
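One set of hypothetical counts that reproduces exactly these numbers (the image and car totals are invented for the arithmetic):

```python
# 2200 test images, 50 contain a car; the model finds 6 of them
# and raises no false alarms on the empty images.
n_images, n_cars, tp = 2200, 50, 6
fn = n_cars - tp              # 44 cars missed
tn = n_images - n_cars        # 2150 empty images correctly left alone
fp = 0

accuracy = (tp + tn) / n_images   # 0.98 -- looks great on paper
recall = tp / n_cars              # 0.12 -- the model misses 88% of cars
print(accuracy, recall)
```

The empty images dominate the accuracy figure, which is why recall (and precision) must be checked separately for the object class you actually care about.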
Key Result
Precision, Recall, and mAP are the key metrics for evaluating detection models; the balance between them reveals true performance.