
Entity linking concept in NLP - Model Metrics & Evaluation

Which metric matters for Entity Linking and WHY

Entity linking matches entity mentions in text to real-world entities, typically entries in a knowledge base. The key metrics are precision and recall. Precision is the fraction of the links the system made that are correct. Recall is the fraction of the entities that should have been linked that the system actually linked. Both matter: wrong links (low precision) confuse users, and missed entities (low recall) lose important information.

Confusion Matrix for Entity Linking
                  | Predicted Linked | Predicted Not Linked |
  ----------------|------------------|----------------------|
  True Linked     | TP = 80          | FN = 20              |
  True Not Linked | FP = 10          | TN = 90              |

  Total samples = 80 + 20 + 10 + 90 = 200

  Precision = TP / (TP + FP) = 80 / (80 + 10) ≈ 0.89
  Recall = TP / (TP + FN) = 80 / (80 + 20) = 0.80
  F1 Score = 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.89 * 0.80) / (0.89 + 0.80) ≈ 0.84
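
The same arithmetic can be checked with a few lines of Python. This is a minimal sketch: the helper name precision_recall_f1 is an assumption made for illustration, and the counts are the ones from the confusion matrix above.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw confusion-matrix counts."""
    # Guard against division by zero when the system links nothing
    # or when there are no true entities.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Counts taken from the confusion matrix: TP = 80, FP = 10, FN = 20.
precision, recall, f1 = precision_recall_f1(tp=80, fp=10, fn=20)
print(f"Precision = {precision:.2f}")  # 0.89
print(f"Recall    = {recall:.2f}")     # 0.80
print(f"F1 Score  = {f1:.2f}")         # 0.84
```

TN does not appear in any of the three formulas, which is why these metrics stay informative even when most of the text contains no entities.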
    
Precision vs Recall Tradeoff with Examples

If the system links only when very sure, precision is high but recall is low. This means fewer wrong links but many missed entities.

If the system links more aggressively, recall is high but precision drops. This means more entities found but more wrong links.

Example: In a news app, high precision matters more, because a wrong link shows readers incorrect information. In a research tool, high recall matters more, because the goal is to find every relevant entity.
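
One way to see the tradeoff is to sweep a confidence threshold over a small set of scored mentions. The sketch below assumes the linker outputs a confidence score per mention and links only when the score reaches the threshold. The scores, the should_link labels, and the function name precision_recall_at are all invented for illustration and do not come from any real system.

```python
# Hypothetical (confidence score, should_link) pairs; the values are made up.
scored_mentions = [
    (0.95, True), (0.90, True), (0.85, True), (0.70, False), (0.65, True),
    (0.60, True), (0.40, False), (0.35, True), (0.30, False), (0.20, False),
]


def precision_recall_at(threshold, items):
    """Link a mention only if its score is at or above the threshold."""
    tp = sum(1 for score, should_link in items if score >= threshold and should_link)
    fp = sum(1 for score, should_link in items if score >= threshold and not should_link)
    fn = sum(1 for score, should_link in items if score < threshold and should_link)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# A strict threshold links rarely; a loose threshold links aggressively.
for threshold in (0.8, 0.5, 0.25):
    p, r = precision_recall_at(threshold, scored_mentions)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data the strictest threshold gives perfect precision with only half the entities found, while the loosest finds every entity at the cost of several wrong links.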

Good vs Bad Metric Values for Entity Linking

Good: As a rough guide, precision and recall both above 0.85 mean most links are correct and few entities are missed.

Bad: Precision below 0.5 means more than half of the links are wrong, which confuses users. Recall below 0.5 means more than half of the entities are missed, which loses information.

Common Pitfalls in Entity Linking Metrics
  • Accuracy paradox: Accuracy can be high even when the model misses many entities, because most text contains no entities and the true negatives dominate the score.
  • Data leakage: If entities from the test data are also used in training, the metrics are inflated and do not reflect real performance.
  • Overfitting: The model memorizes training entities but fails on new ones, causing low recall on real data.

Self Check

Your entity linking model has 98% accuracy but only 12% recall on entities. Is it good?

Answer: No. The high accuracy is misleading because most text has no entities. The very low recall means the model misses almost all entities, so it is not useful.
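
To make the answer concrete, here is a minimal sketch with invented counts chosen so that accuracy comes out at 98% and recall at 12%. The corpus size and the four counts are assumptions for illustration only.

```python
# Hypothetical counts for 10,000 candidate spans, of which only 200 are true entities.
tp = 24      # entities found and linked
fn = 176     # entities missed
fp = 24      # non-entities wrongly linked
tn = 9776    # non-entities correctly left unlinked

total = tp + fn + fp + tn
accuracy = (tp + tn) / total      # dominated by the large TN count
recall = tp / (tp + fn)           # ignores TN entirely

print(f"Accuracy = {accuracy:.2f}")  # 0.98
print(f"Recall   = {recall:.2f}")    # 0.12
print(f"Entities missed: {fn} of {tp + fn}")
```

Because true negatives make up almost all of the data, the model scores 98% accuracy while missing 176 of the 200 entities.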

Key Result
Precision and recall are key for entity linking; balance them to avoid wrong links and missed entities.