Inverted index data structure in Elasticsearch - Time & Space Complexity
When searching text in Elasticsearch, the inverted index helps find words quickly.
We want to know how the search time changes as the amount of text grows.
Analyze the time complexity of building and searching an inverted index.
{
"mappings": {
"properties": {
"content": { "type": "text" }
}
}
}
GET /my_index/_search
{
"query": {
"match": { "content": "search term" }
}
}
This code defines a text field that Elasticsearch indexes using an inverted index, then searches for documents containing "search term".
Look at what repeats when building and searching the inverted index.
- Primary operation: Scanning each word in all documents to build the index.
- How many times: Once per word per document when indexing; then searching looks up words in the index.
As you add more documents or words, indexing takes longer because it reads all words.
| Input Size (n words) | Approx. Operations |
|---|---|
| 10 | About 10 word scans |
| 100 | About 100 word scans |
| 1000 | About 1000 word scans |
Pattern observation: The work grows directly with the number of words; doubling words doubles work.
Time Complexity: O(n)
This means the time to build the index grows in a straight line with the number of words.
[X] Wrong: "Searching the inverted index takes as long as reading all documents again."
[OK] Correct: The inverted index lets Elasticsearch jump directly to matching words, so search is much faster than scanning all documents.
Understanding how inverted indexes scale helps you explain efficient search systems clearly and confidently.
What if we added a second word to search for? How would the time complexity change?