In multi-class classification, the model predicts one label out of several possible classes, and we want to know how often it picks the right one. Accuracy is the simplest metric: the fraction of all predictions that are correct.
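A minimal sketch of accuracy as the fraction of correct predictions; the label lists below are made-up illustration data, not from any real model.

```python
def accuracy(y_true, y_pred):
    # Count predictions that exactly match the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

y_true = ["A", "B", "C", "A", "B", "C"]
y_pred = ["A", "B", "B", "A", "A", "C"]
print(accuracy(y_true, y_pred))  # 4 of 6 correct -> 0.666...
```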
However, accuracy alone can hide problems when classes are imbalanced, that is, when some classes appear far more often than others. We therefore also compute precision, recall, and F1-score for each class, which show how well the model identifies that class without confusing it with the others.
For example, if the model often confuses class A with class B, precision and recall for those classes will be low. This helps us understand where the model struggles.
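The per-class metrics can be sketched from first principles. This is a hypothetical example, constructed so that class A is often confused with class B: both classes score noticeably lower than class C, which is never confused.

```python
def per_class_metrics(y_true, y_pred, label):
    # tp: predicted `label` and correct; fp: predicted `label` but wrong;
    # fn: true `label` that the model missed.
    tp = sum(t == p == label for t, p in zip(y_true, y_pred))
    fp = sum(p == label and t != label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Invented predictions where A is frequently mislabeled as B.
y_true = ["A", "A", "A", "B", "B", "B", "C", "C"]
y_pred = ["B", "A", "B", "B", "A", "B", "C", "C"]
for label in ["A", "B", "C"]:
    p, r, f = per_class_metrics(y_true, y_pred, label)
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Here C gets perfect scores, while the A/B confusion drags down precision and recall for both of those classes, exactly the pattern described above.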