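For hand and face landmark detection, the key metrics are Mean Squared Error (MSE) and Normalized Mean Error (NME). Both measure how far the predicted points are from the true points on the hand or face. NME also divides the error by a reference length, such as the inter-ocular distance for faces, so that scores stay comparable across image sizes.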
We want the predicted landmarks to be as close as possible to the real landmarks, so a smaller error means a better model.
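To make the two error metrics concrete, here is a minimal sketch in plain Python. The function names, the toy coordinates, and the `norm_dist` parameter are illustrative assumptions, not part of any particular library. MSE is averaged over individual coordinates here, which is one common convention; some implementations average over points instead.

```python
import math

def mean_squared_error(pred, true):
    """Average squared difference over all x and y coordinates."""
    total = 0.0
    count = 0
    for (px, py), (tx, ty) in zip(pred, true):
        total += (px - tx) ** 2 + (py - ty) ** 2
        count += 2  # two coordinates per landmark
    return total / count

def normalized_mean_error(pred, true, norm_dist):
    """Mean Euclidean distance per landmark, divided by a reference length.

    norm_dist is an assumed reference length, e.g. the inter-ocular
    distance for faces or a bounding-box size for hands.
    """
    dists = [math.dist(p, t) for p, t in zip(pred, true)]
    return sum(dists) / (len(dists) * norm_dist)

# Toy example: three landmarks, only the first one is mispredicted.
true = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
pred = [(3.0, 4.0), (10.0, 0.0), (5.0, 5.0)]

mse = mean_squared_error(pred, true)
nme = normalized_mean_error(pred, true, norm_dist=10.0)
print(f"MSE = {mse:.4f}, NME = {nme:.4f}")
```

In this toy case the first landmark is off by 5 pixels and the other two are exact, so the mean distance is 5/3 pixels, and dividing by the reference length of 10 gives an NME of about 0.167.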
Sometimes, Percentage of Correct Keypoints (PCK) is used instead. It reports the fraction of predicted points that fall within a chosen distance threshold of the true points, which expresses accuracy in a more intuitive way.
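PCK can be sketched in a few lines as well. In this hedged example the threshold is set as a fraction `alpha` of a reference length `ref_length`. Both names and the default of 0.2 are assumptions chosen for illustration, since the right threshold depends on the dataset and task.

```python
import math

def pck(pred, true, ref_length, alpha=0.2):
    """Fraction of landmarks whose error is within alpha * ref_length."""
    threshold = alpha * ref_length
    correct = sum(
        1 for p, t in zip(pred, true) if math.dist(p, t) <= threshold
    )
    return correct / len(true)

# Toy example: four landmarks with errors of 1, 3, 0 and 0.5 pixels.
true = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0), (0.0, 10.0)]
pred = [(1.0, 0.0), (10.0, 3.0), (5.0, 5.0), (0.0, 10.5)]

score = pck(pred, true, ref_length=10.0, alpha=0.2)
print(f"PCK@0.2 = {score:.2f}")
```

With a threshold of 2 pixels, three of the four points count as correct, so the score is 0.75. Unlike MSE and NME, a higher PCK is better.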