Custom detection dataset in PyTorch - Model Metrics & Evaluation
Which metrics matter for a custom detection dataset, and why
For object detection on a custom dataset, the key metrics are Precision, Recall, and mean Average Precision (mAP). Precision tells us how many detected objects are actually correct, while Recall tells us how many real objects were found. mAP summarizes how well the model detects objects across different classes and confidence thresholds. These metrics matter because detection involves both locating and classifying objects, so we must balance finding every object (Recall) against avoiding false alarms (Precision).
Confusion Matrix for Object Detection
In detection, the confusion matrix is more complex because each prediction includes a bounding box as well as a class. For a single class, it can be simplified as:
+----------------+--------------------+
|                |  Predicted Object  |
|                |   Yes       No     |
+----------------+--------------------+
| Actual Obj Yes |   TP        FN     |
| Actual Obj No  |   FP        TN     |
+----------------+--------------------+
Where:
- TP: Correctly detected objects with sufficient overlap (IoU above the threshold, commonly 0.5)
- FP: Wrong detections, low-overlap boxes, or duplicate detections of the same object
- FN: Missed objects
- TN: Usually not counted in detection, since the number of "background" regions is effectively unbounded
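The counting above can be sketched in plain Python. This is a minimal greedy matcher, not the full COCO evaluation protocol; the box format (x1, y1, x2, y2) and the function names are our own choices for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def count_tp_fp_fn(preds, gts, iou_thresh=0.5):
    """Greedily match predicted boxes to ground-truth boxes."""
    matched = set()
    tp = fp = 0
    for p in preds:
        best, best_i = 0.0, None
        for i, g in enumerate(gts):
            if i in matched:
                continue  # each ground-truth box matches at most once
            overlap = iou(p, g)
            if overlap > best:
                best, best_i = overlap, i
        if best_i is not None and best >= iou_thresh:
            tp += 1
            matched.add(best_i)
        else:
            fp += 1  # low overlap or duplicate of an already-matched box
    fn = len(gts) - len(matched)  # ground-truth boxes nobody claimed
    return tp, fp, fn

# One matching box, one stray prediction, one missed object:
print(count_tp_fp_fn([(0, 0, 10, 10), (20, 20, 30, 30)],
                     [(0, 0, 10, 10), (50, 50, 60, 60)]))  # (1, 1, 1)
```

Because each ground-truth box can be matched only once, a second detection of the same object correctly counts as a false positive rather than a second true positive.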
Precision vs Recall Tradeoff with Examples
Imagine a security camera detecting people:
- High Precision, Low Recall: The camera only alerts when very sure. Few false alarms, but might miss some people.
- High Recall, Low Precision: The camera alerts on many things, catching almost all people but also many false alarms.
Choosing the right balance depends on the use case. For safety, high recall is better so that no one is missed. For convenience, high precision avoids false alerts.
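The tradeoff is controlled by the confidence threshold: raising it trades recall for precision. A minimal sketch, assuming a list of scored detections that have already been judged correct or not (the function name and the toy numbers are hypothetical):

```python
def precision_recall_at(detections, n_actual, threshold):
    """Precision and recall after discarding low-confidence detections.
    detections: list of (confidence, is_correct); n_actual: ground-truth count."""
    kept = [d for d in detections if d[0] >= threshold]
    tp = sum(1 for _, correct in kept if correct)
    fp = len(kept) - tp
    precision = tp / (tp + fp) if kept else 1.0  # no detections -> no false alarms
    recall = tp / n_actual
    return precision, recall

# 6 detections of 5 actual people (one person is never detected at all)
dets = [(0.95, True), (0.90, True), (0.70, False),
        (0.60, True), (0.40, False), (0.30, True)]

print(precision_recall_at(dets, 5, 0.85))  # (1.0, 0.4): sure, but misses people
print(precision_recall_at(dets, 5, 0.20))  # (~0.667, 0.8): catches more, more alarms
```

Sweeping the threshold from high to low traces out the precision-recall curve, which is exactly what AP summarizes.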
Good vs Bad Metric Values for Custom Detection Dataset
- Good: Precision and Recall above 0.8 and mAP above 0.75 mean the model detects most objects correctly and rarely makes mistakes.
- Bad: Precision or Recall below 0.5 means many false detections or many missed objects; mAP below 0.4 indicates poor overall detection quality. (Exact targets depend on the task and how hard the dataset is.)
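To make mAP concrete, here is a minimal sketch of Average Precision for a single class: the area under the interpolated precision-recall curve. mAP then averages AP over classes (and, in COCO-style evaluation, over IoU thresholds from 0.5 to 0.95); this simplified version assumes each prediction is already labeled as a true positive or not.

```python
def average_precision(scored_hits, n_gt):
    """AP for one class: area under the interpolated precision-recall curve.
    scored_hits: list of (confidence, is_tp) for every prediction.
    n_gt: number of ground-truth objects of this class."""
    scored = sorted(scored_hits, key=lambda d: -d[0])  # best-first
    tp = fp = 0
    points = []  # (precision, recall) after each prediction is admitted
    for _, hit in scored:
        if hit:
            tp += 1
        else:
            fp += 1
        points.append((tp / (tp + fp), tp / n_gt))
    ap, prev_recall = 0.0, 0.0
    for i, (_, recall) in enumerate(points):
        # Interpolated precision: best precision at this recall or higher
        precision = max(p for p, _ in points[i:])
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Two objects; the model ranks one correct box, one wrong, one correct:
print(average_precision([(0.9, True), (0.8, False), (0.7, True)], 2))  # ~0.833
```

A model that ranks all correct detections above all mistakes gets AP = 1.0, so AP rewards confidence calibration as well as raw detection quality.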
Common Pitfalls in Detection Metrics
- Accuracy Paradox: High accuracy can be misleading if most images contain no objects. The model can always guess "no object", be right most of the time, and still be useless.
- Data Leakage: Using test images in training inflates metrics falsely.
- Overfitting: Very high training metrics but low test metrics means the model memorizes training data and won't generalize.
- Ignoring IoU Threshold: Counting detections with low overlap as correct inflates metrics unfairly.
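The last pitfall is easy to demonstrate with numbers: a sloppy box that barely clips the object has a tiny IoU, and only a too-low threshold lets it count as correct. The boxes below are invented for illustration.

```python
def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

gt = (0, 0, 100, 100)        # ground-truth box
pred = (60, 60, 160, 160)    # sloppy prediction: only a corner overlaps

overlap = iou(gt, pred)      # 1600 / 18400, about 0.087
print(overlap >= 0.5)        # False: a false positive at the standard threshold
print(overlap >= 0.05)       # True: "correct" only because the bar is absurdly low
```

Always report the IoU threshold alongside your metrics; "mAP@0.5" and "mAP@0.75" can differ dramatically for the same model.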
Self Check
Your detection model has 98% accuracy but only 12% recall on detecting cars. Is it good for production?
Answer: No. The high accuracy is misleading because most images might have no cars, so guessing "no car" is often right. The very low recall means the model misses most cars, which is bad for detection. You should improve recall before using it in production.
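One set of hypothetical counts that reproduces exactly these numbers (the image and car totals are invented for the arithmetic):

```python
# 2200 test images, 50 contain a car; the model finds 6 of them
# and raises no false alarms on the empty images.
n_images, n_cars, tp = 2200, 50, 6
fn = n_cars - tp              # 44 cars missed
tn = n_images - n_cars        # 2150 empty images correctly left alone
fp = 0

accuracy = (tp + tn) / n_images   # 0.98 -- looks great on paper
recall = tp / n_cars              # 0.12 -- the model misses 88% of cars
print(accuracy, recall)
```

The empty images dominate the accuracy figure, which is why recall (and precision) must be checked separately for the object class you actually care about.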
Key Result
Precision, Recall, and mAP are the key metrics for evaluating detection models; the balance between them reveals true performance.