When fine-tuning a computer vision model, the key metrics to watch are accuracy, precision, and recall. Accuracy is the fraction of all predictions that are correct. Precision is the fraction of positive predictions that are actually positive. Recall is the fraction of actual positive cases that the model finds.
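The three definitions above can be sketched in a few lines of plain Python for the binary case. This is a minimal illustration, not a reference implementation: the function name `binary_metrics` and the label lists are hypothetical, and labels are assumed to be 1 for positive and 0 for negative.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        # Guard each ratio against an empty denominator.
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical validation labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0, 1, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)  # {'accuracy': 0.875, 'precision': 1.0, 'recall': 0.75}
```

In this toy example the model never raises a false alarm (precision 1.0) but misses one of the four positive cases (recall 0.75), which accuracy alone would not reveal.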
We focus on these because fine-tuning adapts a pre-trained model to a new task or dataset, so we need to check whether the model gets better at recognizing the new classes without making more false-positive or false-negative errors. Which metric matters most depends on the task. In medical image diagnosis, for example, recall is critical because missing an actual case is usually costlier than raising a false alarm.
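One way to see why the preferred metric depends on the task is to look at how the decision threshold trades precision against recall. The sketch below is only an illustration under assumed data: the helper `precision_recall_at`, the scores, and the thresholds are all hypothetical, and the scores stand in for a fine-tuned model's predicted probability of the positive class.

```python
def precision_recall_at(threshold, y_true, scores):
    """Return (precision, recall) when scores >= threshold are predicted positive."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical ground truth and model scores, for illustration only.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]

strict = precision_recall_at(0.70, y_true, scores)   # few, confident positives
lenient = precision_recall_at(0.35, y_true, scores)  # flag anything plausible

print("strict :", strict)   # (1.0, 0.5)
print("lenient:", lenient)  # (0.8, 1.0)
```

With the stricter threshold every flagged image is a true positive, but half of the actual cases are missed. The more lenient threshold catches every case at the cost of one false alarm, which is the kind of trade a recall-critical task such as diagnosis would typically accept.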