In action recognition, we want to know how well the model identifies the correct action from videos or image sequences. The main metric is accuracy: the percentage of test clips for which the predicted action matches the true action.
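As a minimal sketch of the accuracy computation described above, the snippet below compares predicted and true labels clip by clip. The function name `accuracy` and the example labels are hypothetical, chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for five video clips (illustrative data only).
y_true = ["walk", "run", "jump", "walk", "run"]
y_pred = ["walk", "run", "walk", "walk", "jump"]

acc = accuracy(y_true, y_pred)
print(f"accuracy = {acc:.2f}")  # 3 of 5 clips are correct
```

Multiplying the result by 100 gives the percentage form.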
However, accuracy alone can be misleading when some actions occur far more often than others, because a model can score well by favoring the frequent actions while missing the rare ones. So we also compute precision and recall for each action class, which show whether the model finds a given action reliably without raising too many false alarms.
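To illustrate why accuracy can mislead, here is a small sketch with a made-up imbalanced test set. The class names and counts are assumptions for the example, and the "model" is a trivial baseline that always predicts the most frequent action.

```python
# Hypothetical imbalanced test set: 90 "walk" clips and 10 "fall" clips.
y_true = ["walk"] * 90 + ["fall"] * 10

# A trivial baseline that ignores the video and always predicts the majority class.
y_pred = ["walk"] * len(y_true)

# Overall accuracy looks good because "walk" dominates the test set.
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Yet the rare action is never detected.
falls_found = sum(t == "fall" and p == "fall" for t, p in zip(y_true, y_pred))

print(f"accuracy = {acc:.2f}, falls detected = {falls_found} of 10")
```

The baseline reaches 90% accuracy without ever recognizing a fall, which is the kind of failure that per-class metrics expose.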
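Precision tells us: When the model says an action happened, how often is it right? For a given action, it is TP / (TP + FP), where TP counts correct predictions of that action and FP counts clips wrongly labeled as that action.
Recall tells us: Out of all times an action actually happened, how many did the model find? For a given action, it is TP / (TP + FN), where FN counts clips of that action that the model labeled as something else.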
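The two questions above can be sketched in code by counting true positives, false positives, and false negatives for one action class. The helper `precision_recall` and the labels are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention, not the only possible one.

```python
def precision_recall(y_true, y_pred, action):
    """Precision and recall for a single action class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == action and p == action for t, p in pairs)  # found correctly
    fp = sum(t != action and p == action for t, p in pairs)  # false alarms
    fn = sum(t == action and p != action for t, p in pairs)  # missed
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for seven clips (illustrative data only).
y_true = ["walk", "run", "jump", "walk", "run", "walk", "walk"]
y_pred = ["walk", "run", "walk", "walk", "jump", "run", "jump"]

p, r = precision_recall(y_true, y_pred, "walk")
print(f"walk: precision = {p:.2f}, recall = {r:.2f}")
```

Here the model predicts "walk" three times and is right twice (precision 2/3), while only two of the four real "walk" clips are found (recall 1/2).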
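Finally, the F1 score balances precision and recall by taking their harmonic mean, F1 = 2 * precision * recall / (precision + recall), so it is high only when both are high. Computed per action class and then averaged across classes, it gives a single number to compare models.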
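A minimal sketch of the F1 computation follows. Averaging the per-class F1 scores with equal weight (often called macro averaging) is one common way to get a single number; it is an assumption here, since the text does not say which averaging is intended. The function names and labels are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in the data."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for action in classes:
        tp = sum(t == action and p == action for t, p in pairs)
        fp = sum(t != action and p == action for t, p in pairs)
        fn = sum(t == action and p != action for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(f1_score(precision, recall))
    return sum(scores) / len(scores)


# Hypothetical labels for four clips (illustrative data only).
y_true = ["walk", "walk", "run", "run"]
y_pred = ["walk", "run", "run", "run"]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Macro averaging treats every action equally regardless of how often it occurs, which suits imbalanced datasets; weighting each class by its frequency is an alternative when common actions matter more.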