Full-text search engine concept in Elasticsearch - Time & Space Complexity
When using a full-text search engine like Elasticsearch, it is important to understand how the time it takes to find results changes as the amount of data grows.
We want to know how the search speed changes when we add more documents or search terms.
Analyze the time complexity of the following Elasticsearch query.
```
GET /library/_search
{
  "query": {
    "match": {
      "content": "climate change"
    }
  }
}
```
This query searches the "library" index for documents whose "content" field matches "climate change". A match query analyzes the text into individual terms ("climate" and "change") and, by default, returns documents containing either term.
In this search, Elasticsearch looks through its inverted index to find documents matching the search words.
- Primary operation: looking up the posting list (the list of document IDs) for each search term and merging the results.
- How many times: one posting-list lookup per search term, followed by a merge whose cost grows with the total length of those lists.
As the number of documents grows, the posting lists for common words get longer, so more entries must be checked.
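The lookup-and-merge process above can be sketched with a toy inverted index in Python. The corpus, field contents, and the simple whitespace tokenizer here are illustrative assumptions, not how Elasticsearch is implemented internally:

```python
from collections import defaultdict

# Toy corpus standing in for documents in the "library" index (hypothetical data).
documents = {
    1: "climate change affects coastal cities",
    2: "library budgets and policy change",
    3: "global climate data from satellites",
    4: "climate change and public health",
}

# Build an inverted index: each term maps to a posting list of doc IDs.
inverted_index = defaultdict(list)
for doc_id, text in documents.items():
    for term in set(text.split()):
        inverted_index[term].append(doc_id)

def search(terms):
    """OR-style match: merge the posting lists of all query terms."""
    matches = set()
    for term in terms:
        matches.update(inverted_index.get(term, []))  # one lookup per term
    return sorted(matches)

print(search(["climate", "change"]))  # -> [1, 2, 3, 4]
```

As the posting lists for "climate" and "change" grow with the corpus, the merge step has more entries to visit, which is exactly the growth the table below illustrates.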
| Input Size (n) | Approx. Operations |
|---|---|
| 10 documents | Few entries to check, very fast. |
| 1,000 documents | More entries, takes longer but still quick. |
| 1,000,000 documents | Many entries, search takes noticeably more time. |
Pattern observation: The work grows roughly in proportion to how many documents contain the search terms.
Time Complexity: O(k + r)
This means the search time grows with the number of search terms (k) plus the number of matching documents (r); looking a term up in the index dictionary is effectively constant time, so the dominant cost is walking and merging the posting lists.
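To make the O(k + r) claim concrete, the sketch below counts the work for a two-term query: k dictionary lookups plus the posting entries that must be merged (the number of distinct matching documents r is at most that sum). The posting lists are invented for illustration:

```python
# Hypothetical posting lists: doc IDs containing each term.
posting_lists = {
    "climate": [1, 3, 4, 7, 9],
    "change":  [1, 2, 4, 8],
}

def match_cost(terms):
    """Return (k, merge_work): term lookups plus posting entries to merge."""
    lookups = len(terms)  # k: one dictionary lookup per search term
    entries = sum(len(posting_lists.get(t, [])) for t in terms)
    return lookups, entries  # entries bounds the merge work, hence r

k, work = match_cost(["climate", "change"])
print(k, work)  # -> 2 9
```

Doubling the corpus roughly doubles the length of each posting list, so `work` doubles while `k` stays fixed, which is why more data usually means more search time.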
[X] Wrong: "Searching always takes the same time no matter how many documents exist."
[OK] Correct: The search time depends on how many documents contain the search words, so more data usually means more work.
Understanding how search time grows helps you explain how search engines handle large data efficiently and why indexing matters.
What if we added a filter to only search documents from the last year? How would the time complexity change?
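One possible shape for such a filtered search, assuming a hypothetical `published_date` field (not part of the original mapping), wraps the match query in a `bool` query with a `filter` clause:

```
GET /library/_search
{
  "query": {
    "bool": {
      "must": {
        "match": { "content": "climate change" }
      },
      "filter": {
        "range": { "published_date": { "gte": "now-1y/d" } }
      }
    }
  }
}
```

Because filter clauses are not scored and their results can be cached, the complexity stays roughly O(k + r), but r shrinks to the matching documents from the last year, so the query typically runs faster even though it has an extra clause.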