For pre-training and fine-tuning, the key metrics depend on the stage and the downstream task. Common choices include loss for tracking general learning progress, accuracy for classification, and task-specific metrics such as BLEU for language generation or F1 score for classification with imbalanced classes.
During pre-training, loss (such as cross-entropy on next-token prediction) is the primary signal, since it shows whether the model is learning general patterns from the data. During fine-tuning, task-specific metrics matter more, because they measure how well the model adapts to the target task rather than how well it models the training distribution.
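As a minimal sketch of the two kinds of signal mentioned above, the snippet below computes cross-entropy for a single prediction (the pre-training-style loss) and F1 score for a small imbalanced label set (a fine-tuning-style metric). The helper functions and example values are illustrative, not from any particular library:

```python
import math

def cross_entropy(probs, target_idx):
    # Negative log-likelihood of the true class:
    # low when the model assigns high probability to the correct token/class.
    return -math.log(probs[target_idx])

def f1_score(y_true, y_pred, positive=1):
    # F1 balances precision and recall, which makes it more informative
    # than accuracy when the positive class is rare.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Pre-training-style signal: loss on a single next-token prediction.
print(round(cross_entropy([0.1, 0.7, 0.2], target_idx=1), 4))  # -ln(0.7) ≈ 0.3567

# Fine-tuning-style signal: F1 on an imbalanced binary classification task.
print(f1_score([1, 0, 0, 1, 0, 0], [1, 0, 0, 0, 0, 1]))  # 0.5
```

In practice these come from a framework (e.g. a training loop's loss and an evaluation library), but the arithmetic is exactly this.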