For image classification on benchmark datasets such as CIFAR-10 and ImageNet, accuracy (the fraction of images assigned the correct label) is the most common metric. It works well here because the classes are balanced or nearly so, and the goal is simply to identify the correct category for each image.
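However, when classes are imbalanced or some mistakes are costlier than others, precision, recall, and F1 score become more informative. Precision is the fraction of predictions for a class that are correct, recall is the fraction of that class's true examples the model finds, and F1 is the harmonic mean of the two. For example, if missing a rare class is costly, recall for that class matters most.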
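To make the contrast concrete, here is a minimal sketch in plain Python. The labels, the 95/5 class split, and the function names `accuracy` and `precision_recall_f1` are all made up for illustration; they are not taken from any particular library or dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall_f1(y_true, y_pred, positive):
    """Return (precision, recall, f1), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical imbalanced test set: 95 "common" images and 5 "rare" images.
y_true = ["common"] * 95 + ["rare"] * 5
# Hypothetical model output: finds only 1 of the 5 rare images, with no false alarms.
y_pred = ["common"] * 95 + ["rare"] + ["common"] * 4

acc = accuracy(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="rare")
print(f"accuracy={acc:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case accuracy comes out at 0.96 even though the model misses four of the five rare images; recall for the rare class is only 0.2, which is the number that exposes the problem.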
For datasets with many classes, such as ImageNet, both top-1 and top-5 accuracy are commonly reported. Top-1 accuracy counts a prediction as correct only if the model's highest-scoring class matches the true label. Top-5 accuracy counts it as correct if the true label is among the model's five highest-scoring classes, which is useful when many classes look similar.
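The top-k idea can be sketched in a few lines of plain Python. The score matrix below is invented, uses only four classes, and compares k=1 with k=2 so the example stays readable; `top_k_accuracy` is a hypothetical helper name, not a library function.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score; keep the first k.
        top_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical per-class scores for four images over four classes.
scores = [
    [0.1, 0.6, 0.2, 0.1],  # true class 1: best guess is correct
    [0.5, 0.1, 0.3, 0.1],  # true class 2: second-best guess is correct
    [0.3, 0.4, 0.2, 0.1],  # true class 3: not in the top two
    [0.7, 0.1, 0.1, 0.1],  # true class 0: best guess is correct
]
labels = [1, 2, 3, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1={top1:.2f} top-2={top2:.2f}")
```

Here top-1 accuracy is 0.50 and top-2 accuracy is 0.75: the second image is credited only once the second-best guess is allowed. With k=1 the function reduces to ordinary accuracy, and with k=5 on a 1,000-class score matrix it gives the top-5 figure described above.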