You want to improve search relevance for a collection of very short documents using Elasticsearch. Which scoring approach and adjustment is best to handle short documents effectively?

Hard · Application · Q15 of 15
Elasticsearch - Search Results and Scoring
A. Use BM25 with a higher k1 parameter to ignore term frequency saturation.
B. Use BM25 with a lower b parameter to reduce length normalization effect.
C. Use TF-IDF similarity because it handles short documents better by default.
D. Use TF-IDF and disable inverse document frequency to boost short docs.
Step-by-Step Solution
Solution:
  1. Step 1: Understand BM25 parameters for short docs

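    BM25's b parameter (default 0.75 in Elasticsearch) controls how strongly scores are normalized by document length. Lowering b weakens that normalization, so small length differences between very short documents no longer cause large score swings.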
  2. Step 2: Compare to TF-IDF and other options

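    Classic TF-IDF similarity offers no tunable length-normalization parameter. Raising k1 does not ignore saturation; it delays it, so repeated terms keep adding to the score, and it leaves length normalization unchanged. Disabling IDF removes the weighting that favors rare, informative terms.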
  3. Final Answer:

    Use BM25 with a lower b parameter to reduce length normalization effect. -> Option B
  4. Quick Check:

    Lower b weakens length normalization, which suits collections of short documents [OK]
Quick Trick: Lower BM25's b parameter when the documents are uniformly short [OK]
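The effect of b can be checked with a few lines of arithmetic. The sketch below uses the textbook BM25 term formula, not Elasticsearch's exact implementation (Lucene's scoring may differ in constant factors), and the function names, document lengths, and IDF value are made up for illustration.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """Score contribution of one query term under the textbook BM25 formula."""
    # Length factor: 1.0 for an average-length document, below 1.0 for
    # shorter documents, above 1.0 for longer ones. b scales how far it
    # can move away from 1.0.
    length_norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1.0)) / (tf + k1 * length_norm)


def score_gap(b, short_len=3, long_len=6, avg_len=4.5):
    """Score difference between two short docs that each match the term once.

    The lengths (3 and 6 tokens) and idf=1.0 are illustrative assumptions.
    """
    short_score = bm25_term_score(1, short_len, avg_len, idf=1.0, b=b)
    long_score = bm25_term_score(1, long_len, avg_len, idf=1.0, b=b)
    return short_score - long_score


if __name__ == "__main__":
    for b in (0.75, 0.2, 0.0):
        print(f"b={b}: score gap between 3-token and 6-token doc = {score_gap(b):.4f}")
```

With the default b, a three-token difference in length separates two otherwise identical matches noticeably; as b approaches 0 the gap shrinks and length stops influencing the score.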
Common Mistakes:
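  • Assuming TF-IDF is better for short docs by default
  • Raising k1 in the hope of ignoring saturation; it only delays saturation and does not change length normalization
  • Disabling IDF, which removes the weighting of rare terms and reduces search quality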
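As a rough sketch of how a lower b might be applied, the snippet below builds an index-creation body that defines a custom BM25 similarity and assigns it to a text field. The similarity name `short_text_bm25`, the field name `title`, and the value b=0.3 are assumptions chosen for illustration, not recommended settings; a suitable b should be found by evaluating relevance on your own data.

```python
import json


def short_text_index_body(b=0.3, k1=1.2):
    """Build an index-creation body with a custom BM25 similarity.

    The similarity name, field name, and default b value are hypothetical.
    """
    return {
        "settings": {
            "index": {
                "similarity": {
                    # Custom similarity: BM25 with weaker length normalization.
                    "short_text_bm25": {"type": "BM25", "k1": k1, "b": b}
                }
            }
        },
        "mappings": {
            "properties": {
                # The text field opts in to the custom similarity by name.
                "title": {"type": "text", "similarity": "short_text_bm25"}
            }
        },
    }


if __name__ == "__main__":
    # The printed JSON could be sent as the body of an index-creation request.
    print(json.dumps(short_text_index_body(), indent=2))
```

Similarity is configured per field, so only the short-text field needs the adjusted b while other fields keep the defaults.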