In sentiment analysis, we want to know how well the model identifies positive and negative sentiment in text. The key metrics are precision, recall, and F1-score.
Precision is the fraction of the texts the model labeled as positive (or negative) that actually belong to that class: true positives divided by the sum of true positives and false positives. It matters most when false alarms are costly.
Recall is the fraction of the actual positive (or negative) texts that the model found: true positives divided by the sum of true positives and false negatives. It matters most when missing a sentiment is costly.
F1-score is the harmonic mean of precision and recall. It gives a single number that is high only when both precision and recall are high.
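As a minimal sketch of how these three metrics can be computed for one class, the Python below counts true positives, false positives, and false negatives from two label lists. The function name `precision_recall_f1`, the label strings, and the toy data are illustrative assumptions, as is the convention of returning 0.0 when a denominator is zero.

```python
def precision_recall_f1(y_true, y_pred, target="positive"):
    """Return (precision, recall, F1) for the class named by `target`."""
    pairs = list(zip(y_true, y_pred))
    # Predicted as target and truly target.
    tp = sum(1 for t, p in pairs if p == target and t == target)
    # Predicted as target but truly another class (false alarm).
    fp = sum(1 for t, p in pairs if p == target and t != target)
    # Truly target but predicted as another class (missed).
    fn = sum(1 for t, p in pairs if p != target and t == target)

    # Assumed convention: a metric with a zero denominator is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Toy data (made up for illustration): 4 positive and 4 negative texts.
y_true = ["positive"] * 4 + ["negative"] * 4
y_pred = ["positive", "positive", "positive", "negative",
          "positive", "positive", "negative", "negative"]

p, r, f1 = precision_recall_f1(y_true, y_pred, target="positive")
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
# prints: precision=0.600 recall=0.750 f1=0.667
```

Here the model finds 3 of the 4 positive texts (recall 0.75) but 2 of its 5 positive predictions are wrong (precision 0.6). Calling the function with `target="negative"` gives the same metrics for the negative class.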
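Accuracy, the fraction of all texts classified correctly, is also used but can be misleading if classes are imbalanced (e.g., many more positive than negative reviews): a model that always predicts the majority class scores high accuracy while never detecting the minority class.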
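The imbalance problem can be illustrated with a hypothetical dataset of 95 positive and 5 negative reviews and a trivial model that labels everything as positive. The data and variable names below are assumptions made up for this sketch, not results from a real system.

```python
# Hypothetical imbalanced dataset: 95 positive reviews, 5 negative ones.
y_true = ["positive"] * 95 + ["negative"] * 5
# A trivial "model" that always predicts the majority class.
y_pred = ["positive"] * 100

# Accuracy: share of all texts whose predicted label matches the true label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall for the negative class: share of true negatives that were found.
neg_total = sum(1 for t in y_true if t == "negative")
neg_found = sum(1 for t, p in zip(y_true, y_pred)
                if t == "negative" and p == "negative")
negative_recall = neg_found / neg_total

print(f"accuracy={accuracy:.2f} negative_recall={negative_recall:.2f}")
# prints: accuracy=0.95 negative_recall=0.00
```

The 95% accuracy hides the fact that not a single negative review is detected, which is why per-class precision and recall are worth reporting alongside accuracy.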