Mask R-CNN performs instance segmentation: it finds objects in an image and predicts a pixel-level mask for each one. So we care about two things: how well it finds and labels the right objects, and how accurately it outlines their shapes.
The key metrics are:
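- Mean Average Precision (mAP): Measures how well the model finds and correctly labels objects. Average Precision (AP) summarizes the precision-recall curve for one class as detections are ranked by confidence score, and mAP is the mean of AP over all classes. COCO-style evaluation also averages over IoU thresholds from 0.50 to 0.95, and for Mask R-CNN it is usually reported separately for boxes and for masks.
- Intersection over Union (IoU): Measures how closely the predicted mask matches the true object shape. It is the area where the two masks overlap divided by the area covered by either mask, so it ranges from 0 (no overlap) to 1 (perfect match). Higher IoU means better shape accuracy.
- Precision and Recall: Precision is the fraction of predicted objects that are correct. Recall is the fraction of true objects that were found. A prediction usually counts as correct when its IoU with a true object reaches a chosen threshold, such as 0.5.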
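To make IoU, precision, and recall concrete, here is a minimal NumPy sketch. The helper names `mask_iou` and `precision_recall` are hypothetical and not part of any Mask R-CNN library. The matching is a simplified greedy one-to-one match in input order; real evaluators typically process predictions in descending confidence order and handle classes separately.

```python
import numpy as np

def mask_iou(pred, true):
    """IoU between two boolean masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    true = np.asarray(true, dtype=bool)
    union = np.logical_or(pred, true).sum()
    if union == 0:
        return 0.0  # both masks empty; treating this as 0 is a convention chosen here
    return float(np.logical_and(pred, true).sum() / union)

def precision_recall(pred_masks, true_masks, iou_threshold=0.5):
    """Greedy one-to-one matching of predicted masks to true masks (simplified)."""
    matched = set()  # indices of true masks already claimed by a prediction
    tp = 0
    for p in pred_masks:
        best_iou, best_j = 0.0, None
        for j, t in enumerate(true_masks):
            if j in matched:
                continue
            iou = mask_iou(p, t)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
    fp = len(pred_masks) - tp  # predictions with no matching true object
    fn = len(true_masks) - tp  # true objects that were missed
    precision = tp / (tp + fp) if len(pred_masks) > 0 else 0.0
    recall = tp / (tp + fn) if len(true_masks) > 0 else 0.0
    return precision, recall

# Toy 4x4 example: two true objects and two predictions.
true_a = np.zeros((4, 4), dtype=bool)
true_a[0:2, 0:2] = True   # 4 pixels
true_b = np.zeros((4, 4), dtype=bool)
true_b[2:4, 2:4] = True   # 4 pixels
pred_a = np.zeros((4, 4), dtype=bool)
pred_a[0:2, 0:3] = True   # 6 pixels, covers all of true_a
pred_b = np.zeros((4, 4), dtype=bool)
pred_b[3, 0] = True       # 1 pixel, overlaps nothing

iou_a = mask_iou(pred_a, true_a)  # 4 overlapping pixels out of 6 in the union
precision, recall = precision_recall([pred_a, pred_b], [true_a, true_b])
print(iou_a, precision, recall)
```

In this toy case one prediction matches a true object and one does not, so one true object is found and one is missed.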
We use these metrics together because Mask R-CNN does two things: it detects objects and it segments their shapes. Both need to be accurate.
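As a rough illustration of how Average Precision is built from precision and recall, the sketch below computes AP for a single class at a single IoU threshold. The function name `average_precision` and its inputs are assumptions made for this example: it expects each detection's confidence score plus a flag saying whether it matched a true object at the chosen IoU threshold. It uses simple all-point interpolation, not the exact procedure of the COCO evaluator.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_ground_truth):
    """AP for one class at one IoU threshold (all-point interpolation)."""
    if num_ground_truth == 0 or len(scores) == 0:
        return 0.0
    # Rank detections from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    tp = np.asarray(is_true_positive, dtype=bool)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(~tp)
    recall = cum_tp / num_ground_truth
    precision = cum_tp / (cum_tp + cum_fp)
    # Precision envelope: at each rank, use the best precision at that recall or higher.
    precision = np.maximum.accumulate(precision[::-1])[::-1]
    # Area under the resulting step curve.
    recall_prev = np.concatenate(([0.0], recall[:-1]))
    return float(np.sum((recall - recall_prev) * precision))

# Four detections for one class, two true objects in total.
ap = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_true_positive=[True, False, True, False],
    num_ground_truth=2,
)
print(ap)
```

mAP would then be the mean of such AP values over all classes, and in COCO-style evaluation also over the IoU thresholds.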