
Why can cosine similarity sometimes give misleading results when comparing very sparse vectors in NLP tasks?

A. Because cosine similarity ignores vector length completely
B. Because sparse vectors may have little overlap but still high cosine similarity due to normalization
C. Because cosine similarity is sensitive to vector magnitude differences
D. Because cosine similarity requires dense vectors only
Step-by-Step Solution
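  1. Step 1: Understand sparse vector behavior

    Sparse vectors are mostly zeros. Cosine similarity divides the dot product by the product of the two norms, and in a sparse vector only a handful of non-zero entries contribute to each norm. A single shared feature can then account for a large share of both vectors.
  2. Step 2: Identify why cosine can mislead

    Two vectors that share very few features in absolute terms can therefore still score relatively high. For example, two binary vectors with two non-zero entries each and one entry in common have a cosine similarity of 0.5.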
  3. Final Answer:

    Because sparse vectors may have little overlap but still high cosine similarity due to normalization -> Option B
  4. Quick Check:

    Sparse vectors can mislead cosine similarity [OK]
Quick Trick: Normalization can inflate similarity for sparse vectors [OK]
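
The effect described in the solution can be checked with a small sketch. It is illustrative only: the term names, the binary weights, and the helper `cosine` are assumptions made for this example, not part of the quiz.

```python
import math

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {term: weight} dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)

# Two very sparse vectors: two non-zero terms each, one term in common.
short_a = {"bank": 1.0, "river": 1.0}
short_b = {"bank": 1.0, "loan": 1.0}

# Two fuller vectors: twenty non-zero terms each, five terms in common.
long_a = {f"shared{i}": 1.0 for i in range(5)}
long_a.update({f"a_only{i}": 1.0 for i in range(15)})
long_b = {f"shared{i}": 1.0 for i in range(5)}
long_b.update({f"b_only{i}": 1.0 for i in range(15)})

sim_short = cosine(short_a, short_b)  # 1 / (sqrt(2) * sqrt(2)), about 0.5
sim_long = cosine(long_a, long_b)     # 5 / (sqrt(20) * sqrt(20)), about 0.25

print(round(sim_short, 3), round(sim_long, 3))
```

In this sketch one shared term out of two scores higher than five shared terms out of twenty, so the sparser pair looks more similar despite having less overlap in absolute terms.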
Common Mistakes:
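  • Thinking that ignoring vector length is itself the problem (length invariance is by design)
  • Assuming cosine requires dense vectors (it is defined for any non-zero vectors)
  • Believing cosine is sensitive to magnitude differences, when normalization removes them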
