For MediaPipe Pose, the key metric is Mean Average Precision (mAP) or Percentage of Correct Keypoints (PCK). These metrics measure how accurately the model detects body landmarks compared to the true positions.
We care about these because the goal is to find exact points on the body like elbows or knees. If the points are off, the pose estimation is wrong. So, accuracy in locating these points is critical.
Also, inference speed matters because pose detection often runs live on video. A slow model makes the experience laggy.