When training an image classifier, the main goal is to assign each image to its correct category. The key metrics to watch are accuracy, precision, recall, and F1 score.
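Accuracy is the overall percentage of images the model classified correctly. But accuracy alone can be misleading when some classes are much more common than others: a model that always predicts the majority class can score high accuracy while never detecting the rare ones.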
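To make the class-imbalance problem concrete, here is a minimal sketch in plain Python. The labels, the 95/5 split, and the `accuracy` helper are all made up for illustration; they are not taken from any particular dataset or library.

```python
# Hypothetical toy data: 95 "cat" images and 5 "dog" images.
y_true = ["cat"] * 95 + ["dog"] * 5

# A degenerate "model" that always predicts the majority class.
y_pred = ["cat"] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


acc = accuracy(y_true, y_pred)

# Count how many of the rare "dog" images were actually detected.
dogs_found = sum(t == "dog" and p == "dog" for t, p in zip(y_true, y_pred))

print(acc)         # 0.95
print(dogs_found)  # 0
```

The model reaches 95% accuracy without ever recognizing a dog, which is why the per-class metrics matter.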
Precision measures what fraction of the images the model labeled as a certain class actually belong to that class. This is important when false positives (false alarms) are costly.
Recall measures what fraction of the images that truly belong to a class the model successfully found. This matters when missing an instance of a class is costly.
F1 score is the harmonic mean of precision and recall, giving a single number that summarizes the model's quality on each class. It is high only when both precision and recall are high.
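The three per-class metrics can be computed from counts of true positives, false positives, and false negatives. The sketch below is one way to do it in plain Python; the function name `precision_recall_f1`, the toy labels, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration.

```python
def precision_recall_f1(y_true, y_pred, cls):
    """Precision, recall, and F1 for a single class `cls`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == cls and p == cls for t, p in pairs)  # correctly labeled cls
    fp = sum(t != cls and p == cls for t, p in pairs)  # false alarms
    fn = sum(t == cls and p != cls for t, p in pairs)  # missed instances

    # Assumption: define a metric as 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical toy labels for a three-class problem.
y_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "cat", "dog", "dog", "cat", "bird"]

p, r, f1 = precision_recall_f1(y_true, y_pred, "cat")
print(f"cat: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

For "cat" in this toy data there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3.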
For image classifiers, especially those with many classes, it's good to look at these metrics per class and also aggregated across all classes, for example as an unweighted (macro) average of the per-class scores.
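One possible way to get both views at once is to build a small per-class report and then average it. The sketch below uses only the standard library; `per_class_report`, the toy labels, and the use of a macro average (every class weighted equally) are illustrative assumptions, and other aggregation schemes exist.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Return (per-class metrics, macro-averaged F1, overall accuracy)."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gets a false positive
            fn[t] += 1  # true class gets a false negative

    report = {}
    for c in classes:
        # Assumption: a metric is 0.0 when its denominator is zero.
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        report[c] = {"precision": prec, "recall": rec, "f1": f1}

    # Macro average: every class counts equally, however rare it is.
    macro_f1 = sum(m["f1"] for m in report.values()) / len(classes)
    overall_accuracy = sum(tp.values()) / len(y_true)
    return report, macro_f1, overall_accuracy


# Hypothetical toy labels: "cat" is twice as common as "dog".
y_true = ["cat", "cat", "cat", "cat", "dog", "dog"]
y_pred = ["cat", "cat", "cat", "dog", "dog", "cat"]

report, macro_f1, overall_accuracy = per_class_report(y_true, y_pred)
for cls, metrics in report.items():
    print(cls, metrics)
print("macro F1:", macro_f1)
print("accuracy:", overall_accuracy)
```

Reading the per-class rows next to the aggregate shows whether a good overall score is hiding a weak class: here "dog" scores noticeably lower than "cat", which the single accuracy figure does not reveal.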