FastText embeddings in NLP - Model Metrics & Evaluation

FastText embeddings represent words as dense vectors that capture meaning. To evaluate how good these vectors are, we use cosine similarity, which measures how close two word vectors are in meaning: the higher the cosine similarity, the more semantically related the words. For downstream tasks that use FastText, such as text classification, we also check accuracy or the F1 score to see how well the model understands text through these embeddings.
|                 | Predicted Positive | Predicted Negative |
|-----------------|--------------------|--------------------|
| Actual Positive | True Positive (TP) = 80 | False Negative (FN) = 20 |
| Actual Negative | False Positive (FP) = 10 | True Negative (TN) = 90 |

Total samples = 80 + 20 + 10 + 90 = 200
From this matrix, we calculate:
- Precision = TP / (TP + FP) = 80 / (80 + 10) = 0.89
- Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
- F1 Score = 2 * (Precision * Recall) / (Precision + Recall) ≈ 0.84
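The calculations above can be sketched directly from the confusion-matrix counts (a minimal snippet using the example numbers from the table):

```python
# Precision, recall, and F1 from the confusion-matrix counts in the table above.
TP, FN, FP, TN = 80, 20, 10, 90

precision = TP / (TP + FP)  # 80 / 90 ≈ 0.89
recall = TP / (TP + FN)     # 80 / 100 = 0.80
f1 = 2 * precision * recall / (precision + recall)

print(f"Precision: {precision:.2f}")  # 0.89
print(f"Recall:    {recall:.2f}")     # 0.80
print(f"F1 score:  {f1:.2f}")         # 0.84
```

In practice, libraries such as scikit-learn compute these for you, but the arithmetic is exactly this.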
Imagine a spam detector using FastText embeddings:
- High Precision: Few good emails are wrongly marked as spam. Users don't miss important emails.
- High Recall: Most spam emails are caught. Less spam reaches the inbox.
Depending on what matters more, we adjust the model. For spam, high precision is often preferred to avoid losing good emails. For medical text classification, high recall is critical to catch all important cases.
Good metrics mean the embeddings help the model understand text well:
- Good: Accuracy > 85%, F1 score > 0.8, cosine similarity between related words > 0.7
- Bad: Accuracy < 60%, F1 score < 0.5, cosine similarity between related words < 0.3
Bad values suggest embeddings do not capture meaning well or the model is not learning properly.
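To make the cosine-similarity thresholds concrete, here is a minimal sketch of the computation on hypothetical word vectors (the 4-dimensional values are made up for illustration; real FastText embeddings typically have 100-300 dimensions):

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = orthogonal."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical embeddings: "king" and "queen" point in similar directions,
# "apple" does not.
vec_king  = np.array([0.8, 0.6, 0.1, 0.2])
vec_queen = np.array([0.7, 0.7, 0.2, 0.1])
vec_apple = np.array([0.1, 0.0, 0.9, 0.8])

print(cosine_similarity(vec_king, vec_queen))  # related words: high (> 0.7)
print(cosine_similarity(vec_king, vec_apple))  # unrelated words: low (< 0.3)
```

With a trained gensim FastText model, the same check is `model.wv.similarity("king", "queen")`.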
- Accuracy paradox: High accuracy can be misleading if classes are imbalanced.
- Data leakage: Using test data during training inflates metrics falsely.
- Overfitting: Very high training accuracy but low test accuracy means the model memorizes instead of generalizing.
- Ignoring semantic similarity: Only checking classification metrics misses how well embeddings capture word meaning.
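The accuracy paradox from the list above is easy to demonstrate with a toy imbalanced dataset (the 980/20 split is illustrative): a model that always predicts the majority class scores high accuracy while catching no positives at all.

```python
# Accuracy paradox: 980 negatives, 20 positives.
# A model that always predicts "negative" still reaches 98% accuracy.
y_true = [1] * 20 + [0] * 980
y_pred = [0] * 1000  # majority-class baseline, never predicts positive

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
recall = tp / (tp + fn)

print(f"Accuracy: {accuracy:.0%}")  # 98%
print(f"Recall:   {recall:.0%}")    # 0%
```

This is exactly why the next question pairs a 98% accuracy figure with a very low recall.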
Your text classification model using FastText embeddings has 98% accuracy but only 12% recall on the positive class. Is it good for production? Why or why not?
Answer: No, it is not good. The low recall means the model misses most positive cases, which can be critical depending on the task. High accuracy alone is misleading if the positive class is rare.