Highlighting matched text in Elasticsearch - Time & Space Complexity
When we ask about time complexity for highlighting matched text in Elasticsearch, we want to know how the work grows as the data or query size grows.
Specifically, how much more time does Elasticsearch spend as it has to find and highlight matches in more documents?
Analyze the time complexity of the following Elasticsearch query with highlighting.
GET /my_index/_search
{
  "query": {
    "match": { "content": "search term" }
  },
  "highlight": {
    "fields": { "content": {} }
  }
}
This query searches for documents matching "search term" in the content field and highlights the matched parts in the results.
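To make "highlights the matched parts in the results" concrete, the sketch below reads highlight fragments out of a response. The `sample_response` dictionary is a hypothetical, trimmed example written for illustration, and `extract_fragments` is a helper name invented here. The sketch relies only on the response layout in which each hit may carry a `highlight` object mapping field names to lists of fragments, and on `<em>` being the default highlight tag.

```python
# Hypothetical, trimmed response for illustration only; a real response
# carries more metadata (took, _shards, _score, and so on).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {"content": "How to search for a term quickly"},
                "highlight": {
                    "content": ["How to <em>search</em> for a <em>term</em> quickly"]
                },
            }
        ]
    }
}


def extract_fragments(response, field):
    """Collect (document id, fragment) pairs for one highlighted field."""
    pairs = []
    for hit in response["hits"]["hits"]:
        # A hit has no "highlight" key when nothing in the field matched.
        for fragment in hit.get("highlight", {}).get(field, []):
            pairs.append((hit["_id"], fragment))
    return pairs


print(extract_fragments(sample_response, "content"))
```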
Look at what repeats as the query runs:
- Primary operation: Scanning a document's content field to locate the matched terms, then wrapping them in highlight tags.
- How many times: Once for each document returned in the response (at most the request's size, 10 by default) and for each highlighted field in that document. Matching documents that are not returned are not highlighted.
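The repeated work listed above can be sketched as a nested loop in plain Python. This is a toy model and not Elasticsearch's highlighter: `toy_highlight`, `highlight_hits`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration. Only the `<em>` tags mirror Elasticsearch's default highlight tags.

```python
def toy_highlight(text, terms):
    """Wrap every token that equals a query term in <em> tags.

    Returns the highlighted text and the number of tokens scanned.
    """
    wanted = {t.lower() for t in terms}
    out = []
    scanned = 0
    for token in text.split():  # naive whitespace tokenizer (assumption)
        scanned += 1
        if token.lower() in wanted:
            out.append(f"<em>{token}</em>")
        else:
            out.append(token)
    return " ".join(out), scanned


def highlight_hits(hits, fields, terms):
    """Highlight each requested field of each hit and count tokens scanned."""
    total_scanned = 0
    results = []
    for hit in hits:          # once per document being highlighted
        highlighted = {}
        for field in fields:  # once per highlighted field
            highlighted[field], scanned = toy_highlight(hit[field], terms)
            total_scanned += scanned
        results.append(highlighted)
    return results, total_scanned


hits = [
    {"content": "a search term appears here"},
    {"content": "another term to search for"},
]
results, total = highlight_hits(hits, ["content"], ["search", "term"])
print(results[0]["content"])  # a <em>search</em> <em>term</em> appears here
print(total)                  # 10 tokens scanned across both documents
```

The outer loop is what the analysis counts: every extra document adds one more pass over its highlighted fields.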
As the number of documents returned with highlights grows, Elasticsearch must produce more highlighted snippets.
| Highlighted documents (n) | Approx. operations |
|---|---|
| 10 | Highlighting in 10 documents (baseline) |
| 100 | Highlighting in 100 documents (about 10 times the baseline) |
| 1000 | Highlighting in 1000 documents (about 100 times the baseline) |
Pattern observation: The work grows roughly in direct proportion to the number of documents highlighted, assuming their fields are of similar length, because each document's text must be scanned and highlighted.
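The proportions in the table can be reproduced with a small counting model. This is a sketch under stated assumptions, not a benchmark: `highlight_cost` is a hypothetical function, every document is assumed to hold the same number of tokens (200 is an arbitrary choice), and one unit of work is charged per token scanned.

```python
def highlight_cost(num_docs, tokens_per_doc, num_fields=1):
    """Toy cost model: one unit of work per token scanned."""
    work = 0
    for _ in range(num_docs):
        for _ in range(num_fields):
            work += tokens_per_doc
    return work


baseline = highlight_cost(10, tokens_per_doc=200)
for n in (10, 100, 1000):
    cost = highlight_cost(n, tokens_per_doc=200)
    # Columns: documents highlighted, work units, multiple of the baseline
    print(n, cost, cost // baseline)
```

Under these assumptions the last column prints 1, 10, and 100, matching the growth described in the table.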
Time Complexity: O(n)
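This means the time to highlight matched text grows linearly with n, the number of documents that are highlighted. Elasticsearch highlights only the hits it returns, so n is capped by the request's size parameter (10 by default) rather than by the total number of matches. Longer fields raise the cost per document but do not change the linear shape.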
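One way to write the linear bound down, as a simplified model and not a statement about Elasticsearch internals: assume highlighting a field costs time proportional to its length. The symbols are introduced here for illustration: $n$ is the number of documents highlighted, $F$ the set of highlighted fields, $L_{i,f}$ the length of field $f$ in document $i$, $L_{\max}$ the largest such length, and $c$ a constant cost per unit of text.

```latex
T(n) \;=\; \sum_{i=1}^{n} \sum_{f \in F} c \, L_{i,f}
     \;\le\; c \, |F| \, L_{\max} \, n
     \;=\; O(n)
\quad \text{when } |F| \text{ and } L_{\max} \text{ are treated as constants.}
```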
[X] Wrong: "Highlighting happens instantly and does not add time as more documents are returned."
[OK] Correct: Highlighting requires scanning the text of each returned document, so more highlighted documents mean more work and more time.
Understanding how highlighting scales helps you explain performance trade-offs when searching and displaying results, a useful skill in real projects.
"What if we added highlighting on multiple fields instead of just one? How would the time complexity change?"