
TF-IDF and BM25 scoring in Elasticsearch - Step-by-Step Execution

Concept Flow - TF-IDF and BM25 scoring
Input Query
Document Collection
Calculate Term Frequency (TF)
Calculate Inverse Document Frequency (IDF)
Compute TF-IDF Score
Apply BM25 Formula (TF, IDF, doc length)
Rank Documents by Score
Return Top Results
The flow shows how Elasticsearch scores documents by first calculating term frequency and inverse document frequency, then combining them using TF-IDF or BM25 formulas to rank documents.
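The first three boxes of the flow can be sketched in a few lines of Python. This is a minimal illustration, not Elasticsearch's implementation: the toy corpus, the whitespace tokenizer, and the function names are all assumptions made for the example, and the IDF used is the classic log(N / df) form.

```python
import math
from collections import Counter

# Hypothetical toy corpus; document names and texts are made up for illustration.
docs = {
    "Doc1": "apple apple banana apple pie",
    "Doc2": "banana bread recipe",
    "Doc3": "cherry pie recipe",
}

def term_frequency(term, text):
    """Raw count of `term` in whitespace-tokenized `text`."""
    return Counter(text.split())[term]

def inverse_document_frequency(term, corpus):
    """Classic IDF: log(N / df). Returns 0.0 if no document contains the term."""
    df = sum(1 for text in corpus.values() if term in text.split())
    return math.log(len(corpus) / df) if df else 0.0

def tf_idf(term, doc_id, corpus):
    """TF-IDF: term frequency in one document times the term's IDF."""
    return term_frequency(term, corpus[doc_id]) * inverse_document_frequency(term, corpus)

for term in ("apple", "banana"):
    print(term, term_frequency(term, docs["Doc1"]), round(tf_idf(term, "Doc1", docs), 3))
```

In this toy corpus 'apple' appears in only one document, so it gets a higher IDF than 'banana', which appears in two.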
Execution Sample
GET /my_index/_search
{
  "query": {
    "match": { "text": "apple banana" }
  }
}
This query matches documents whose text field contains 'apple' or 'banana' (the match query combines terms with OR by default) and scores them with BM25, Elasticsearch's default similarity.
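To see how a score was actually computed, the search request accepts an explain flag that adds a per-hit scoring breakdown to the response. The sketch below only builds the request body as a Python dict; the helper name build_search_body is hypothetical, and sending the request is left out because it needs a running cluster.

```python
import json

def build_search_body(field, text, explain=True):
    """Build a match-query body. With explain=True, Elasticsearch includes
    a breakdown of how each hit's score was computed."""
    return {"explain": explain, "query": {"match": {field: text}}}

# Same field and query text as the sample request above.
body = build_search_body("text", "apple banana")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of GET /my_index/_search.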
Execution Table
| Step | Action | Term | TF | IDF | BM25 Score | Explanation |
|---|---|---|---|---|---|---|
| 1 | Calculate TF for 'apple' in Doc1 | apple | 3 | 2.0 | 0 | Count how many times 'apple' appears in Doc1 |
| 2 | Calculate TF for 'banana' in Doc1 | banana | 1 | 1.5 | 0 | Count how many times 'banana' appears in Doc1 |
| 3 | Calculate IDF for 'apple' | apple | - | 2.0 | 0 | Inverse document frequency for 'apple' |
| 4 | Calculate IDF for 'banana' | banana | - | 1.5 | 0 | Inverse document frequency for 'banana' |
| 5 | Compute BM25 score for 'apple' in Doc1 | apple | 3 | 2.0 | 2.5 | Apply BM25 formula with TF=3, IDF=2.0, doc length normalization |
| 6 | Compute BM25 score for 'banana' in Doc1 | banana | 1 | 1.5 | 1.2 | Apply BM25 formula with TF=1, IDF=1.5, doc length normalization |
| 7 | Sum BM25 scores for Doc1 | - | - | - | 3.7 | Total BM25 score for Doc1 is sum of term scores |
| 8 | Repeat steps 1-7 for Doc2 | - | - | - | 2.1 | Calculate scores for another document |
| 9 | Rank documents by BM25 score | - | - | - | - | Doc1 (3.7) ranks higher than Doc2 (2.1) |
| 10 | Return top ranked documents | - | - | - | - | Results returned to user sorted by score |
💡 All documents scored and ranked; top results returned.
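Steps 5 and 6 apply the BM25 formula to one term at a time. A minimal sketch of the textbook form follows, assuming Elasticsearch's default parameters k1=1.2 and b=0.75. The function name and the document lengths are assumptions; the table does not state lengths, so the results here will not reproduce its illustrative 2.5 and 1.2, and Elasticsearch's own implementation may differ in constant factors.

```python
def bm25_term_score(tf, idf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """Textbook BM25 contribution of one term to one document's score."""
    # Length normalization: 1.0 for an average-length document,
    # larger for longer documents, smaller for shorter ones.
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)

# TF and IDF come from the execution table; the lengths are assumed
# (Doc1 is treated as exactly average length).
apple = bm25_term_score(tf=3, idf=2.0, doc_len=100, avg_doc_len=100)
banana = bm25_term_score(tf=1, idf=1.5, doc_len=100, avg_doc_len=100)
print(round(apple, 3), round(banana, 3))
```

k1 controls how quickly repeated occurrences of a term stop adding to the score, and b controls how strongly document length is taken into account.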
Variable Tracker
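| Variable | Start | After Step 1 | After Step 5 | After Step 7 | Final |
|---|---|---|---|---|---|
| TF_apple_Doc1 | 0 | 3 | 3 | 3 | 3 |
| TF_banana_Doc1 | 0 | 1 | 1 | 1 | 1 |
| IDF_apple | - | - | 2.0 | 2.0 | 2.0 |
| IDF_banana | - | - | 1.5 | 1.5 | 1.5 |
| BM25_apple_Doc1 | 0 | 0 | 2.5 | 2.5 | 2.5 |
| BM25_banana_Doc1 | 0 | 0 | 0 | 1.2 | 1.2 |
| BM25_total_Doc1 | 0 | 0 | 0 | 3.7 | 3.7 |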
Key Moments - 3 Insights
Why does BM25 score consider document length while TF-IDF does not?
BM25 normalizes term frequency by document length, relative to the average length in the index, so a long document does not score higher merely because it contains more words. Steps 5 and 6 apply this normalization; plain TF-IDF, which only multiplies TF by IDF, does not.
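The effect of length normalization can be checked in isolation. The sketch below computes only the term-frequency part of textbook BM25 for a short and a long document with the same TF; the function name and all lengths are assumed values chosen for illustration.

```python
def normalized_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """TF part of textbook BM25, before multiplying by IDF."""
    length_norm = 1 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1) / (tf + k1 * length_norm)

avg_len = 100  # assumed average document length, in terms

short_doc = normalized_tf(tf=3, doc_len=50, avg_doc_len=avg_len)
long_doc = normalized_tf(tf=3, doc_len=500, avg_doc_len=avg_len)
print(short_doc > long_doc)  # same TF, but the shorter document scores higher

# With b=0, length normalization is switched off and both documents tie.
print(normalized_tf(3, 50, avg_len, b=0) == normalized_tf(3, 500, avg_len, b=0))
```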
Why is IDF important in scoring?
IDF reduces the weight of common terms across documents, making rare terms more influential, as seen in steps 3 and 4 where IDF values differ per term.
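A small numeric check makes the difference concrete. This sketch uses the IDF form commonly paired with BM25, ln(1 + (N - n + 0.5) / (n + 0.5)); the index size and document counts are assumed values, not taken from the execution table.

```python
import math

def bm25_idf(doc_count, docs_with_term):
    """IDF form commonly used with BM25: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1 + (doc_count - docs_with_term + 0.5) / (docs_with_term + 0.5))

total_docs = 1000  # assumed index size

common = bm25_idf(total_docs, docs_with_term=900)  # term found in most documents
rare = bm25_idf(total_docs, docs_with_term=10)     # term found in few documents
print(round(common, 3), round(rare, 3))
```

The rare term's IDF is many times larger, so a single match on it moves a document's score far more than a match on the common term.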
How are scores combined for multiple terms?
Scores for each term are calculated separately then summed to get the document's total score, demonstrated in step 7.
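Steps 7 through 9 reduce to a sum and a sort. A minimal sketch using the values from the execution table follows; the variable names are illustrative.

```python
# Per-term BM25 scores for Doc1, taken from steps 5 and 6 of the execution table.
doc1_term_scores = {"apple": 2.5, "banana": 1.2}

# Doc2's total comes from step 8; the table gives no per-term breakdown for it.
totals = {
    "Doc1": sum(doc1_term_scores.values()),
    "Doc2": 2.1,
}

# Rank documents by total score, highest first.
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
for doc_id, score in ranking:
    print(doc_id, round(score, 1))
```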
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the BM25 score for 'apple' in Doc1 at step 5?
A. 1.5
B. 3
C. 2.5
D. 0
💡 Hint
Check the BM25 Score column at step 5 in the Execution Table.
At which step does the total BM25 score for Doc1 get calculated?
A. Step 6
B. Step 7
C. Step 9
D. Step 3
💡 Hint
Look for the row where BM25_total_Doc1 is updated in the Variable Tracker and the Execution Table.
If the term frequency of 'banana' in Doc1 increased to 4, how would the BM25 score at step 6 change?
A. It would increase
B. It would decrease
C. It would stay the same
D. It would become zero
💡 Hint
Higher term frequency increases BM25 score as seen in steps 1, 2, 5, and 6.
Concept Snapshot
TF-IDF and BM25 scoring in Elasticsearch:
- TF counts term appearances in a doc
- IDF measures term rarity across docs
- TF-IDF multiplies TF by IDF
- BM25 improves on TF-IDF by normalizing for doc length and damping the effect of repeated terms
- Elasticsearch uses BM25 by default to rank search results
Full Transcript
This visual execution shows how Elasticsearch scores documents using TF-IDF and BM25. First, it counts how often each search term appears in a document (TF). Then, it calculates how rare each term is across all documents (IDF). BM25 scoring combines these with adjustments for document length to avoid favoring longer documents. Each term's BM25 score is computed and summed to get the document's total score. Documents are then ranked by these scores to return the most relevant results. Key points include the importance of IDF to reduce common term weight, BM25's length normalization, and summing scores for multiple terms. The execution table traces these calculations step-by-step for clarity.