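For image classification with CNNs, accuracy is usually the headline metric: the fraction of images the model labels correctly out of all images evaluated. But accuracy alone can be misleading when classes are imbalanced, because a model can score well by favoring the majority class while rarely identifying the minority class.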
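To make the imbalance problem concrete, here is a minimal sketch in plain Python. The labels are made up for illustration (95 images of class 0 and 5 of class 1), and the "model" is a hypothetical one that always predicts the majority class. The helper names `accuracy` and `fraction_found` are assumptions, not part of any library.

```python
# Hypothetical, hand-made labels: 95 images of class 0, 5 images of class 1.
imbalanced_true = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority class 0.
imbalanced_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of images whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def fraction_found(y_true, y_pred, cls):
    """Fraction of the images truly in `cls` that were predicted as `cls`."""
    actual = sum(t == cls for t in y_true)
    found = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    return found / actual if actual else 0.0


overall_accuracy = accuracy(imbalanced_true, imbalanced_pred)
minority_found = fraction_found(imbalanced_true, imbalanced_pred, 1)

print(f"accuracy: {overall_accuracy:.2f}")              # accuracy: 0.95
print(f"minority class found: {minority_found:.2f}")    # minority class found: 0.00
```

The model reaches 95% accuracy without ever identifying a single image of class 1, which is why accuracy needs to be read alongside per-class metrics.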
So we also look at precision and recall for each class. Precision is the fraction of images predicted as a class that actually belong to it. Recall is the fraction of images truly belonging to a class that the model correctly identifies. The F1 score is the harmonic mean of precision and recall, so it is high only when both are high.
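The per-class definitions can be sketched directly from counts of true positives (TP), false positives (FP), and false negatives (FN). This is an illustrative implementation in plain Python, not taken from any particular library; the function name `per_class_metrics` and the example labels are assumptions. Returning 0.0 when a denominator is zero is one common convention, not the only one.

```python
def per_class_metrics(y_true, y_pred, classes):
    """Compute precision, recall, and F1 for each class from label lists."""
    metrics = {}
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == cls and p == cls for t, p in pairs)  # correctly predicted as cls
        fp = sum(t != cls and p == cls for t, p in pairs)  # wrongly predicted as cls
        fn = sum(t == cls and p != cls for t, p in pairs)  # cls images that were missed

        # Assumed convention: a metric with a zero denominator is reported as 0.0.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )
        metrics[cls] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical true and predicted labels for six images.
example_true = ["cat", "cat", "cat", "dog", "dog", "bird"]
example_pred = ["cat", "cat", "dog", "dog", "cat", "bird"]

results = per_class_metrics(example_true, example_pred, ["cat", "dog", "bird"])
for cls, m in results.items():
    print(f"{cls}: precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

In this toy example one cat is mistaken for a dog and one dog for a cat, so both classes lose precision and recall, while "bird" scores 1.0 on all three metrics. In practice, a library routine such as scikit-learn's `classification_report` reports the same quantities.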
Together, these metrics show whether the CNN recognizes each class reliably and whether it is confusing some classes with others.