When a vector store serves as long-term memory, the primary metric is recall: the fraction of all relevant stored memories that a query actually retrieves. Recall matters most because a relevant memory that is never retrieved is effectively forgotten, so the system loses knowledge it already has.
The second metric is precision: the fraction of retrieved memories that are actually relevant. High precision means fewer distractions from unrelated memories.
The F1 score, the harmonic mean of precision and recall, combines the two into a single number. It is high only when both are high, so it rewards memory retrieval that is both complete and accurate.
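As a rough illustration of how recall, precision, and F1 relate, here is a minimal sketch for a single query. It assumes each memory has a unique ID and that a ground-truth set of relevant IDs is available; the function name `precision_recall_f1` and the IDs `m1` to `m5` are hypothetical, not part of any particular vector store API.

```python
def precision_recall_f1(retrieved_ids, relevant_ids):
    """Compute precision, recall, and F1 for one query.

    retrieved_ids: IDs returned by the vector store (hypothetical input).
    relevant_ids:  IDs judged relevant for this query (ground truth).
    """
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    hits = len(retrieved & relevant)  # relevant memories that were retrieved

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Example: 4 memories retrieved, 3 are truly relevant, 2 overlap.
p, r, f1 = precision_recall_f1(["m1", "m2", "m3", "m4"], ["m1", "m3", "m5"])
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.5 0.667 0.571
```

In the example, the missed memory `m5` lowers recall, while the two unrelated memories `m2` and `m4` lower precision. In practice these values would usually be averaged over many queries.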
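When the order of results matters, Mean Average Precision (MAP) or Normalized Discounted Cumulative Gain (NDCG) can measure how close to the top of the ranking the most relevant memories appear. MAP treats relevance as binary, while NDCG also supports graded relevance scores.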
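The ranking metrics can be sketched in the same way. The code below is an illustrative implementation, not a reference one: it assumes each ranked list contains unique memory IDs, uses the common log2 discount for NDCG, and takes graded relevance as a plain dictionary. The names `average_precision`, `mean_average_precision`, and `ndcg_at_k` are chosen here for illustration.

```python
import math


def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query with binary relevance."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, memory_id in enumerate(ranked_ids, start=1):
        if memory_id in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    # Dividing by all relevant items penalizes relevant memories never retrieved.
    return total / len(relevant)


def mean_average_precision(queries):
    """MAP over a list of (ranked_ids, relevant_ids) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


def ndcg_at_k(ranked_ids, gains, k):
    """NDCG@k, where gains maps memory ID -> graded relevance (missing = 0)."""
    dcg = sum(
        gains.get(memory_id, 0.0) / math.log2(i + 2)
        for i, memory_id in enumerate(ranked_ids[:k])
    )
    # Ideal DCG: the same gains sorted from most to least relevant.
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Example: relevant memories m3 and m1 appear at ranks 1 and 3.
ranked = ["m3", "m2", "m1", "m4"]
print(average_precision(ranked, ["m1", "m3"]))       # (1/1 + 2/3) / 2
print(ndcg_at_k(ranked, {"m1": 3, "m3": 1}, k=4))    # below 1.0: m1 is ranked too low
print(ndcg_at_k(["m1", "m3"], {"m1": 3, "m3": 1}, k=4))  # ideal ordering
```

The second call scores below 1.0 because the most valuable memory, `m1`, sits at rank 3 instead of rank 1, which is exactly the kind of ordering problem these metrics are meant to expose.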