Completion suggester in Elasticsearch - Time & Space Complexity
When using the completion suggester in Elasticsearch, it's important to understand how the time to find suggestions changes as the number of indexed entries grows.
We want to know how the search speed changes when we add more data to the suggester.
Analyze the time complexity of the following Elasticsearch completion suggester query.
POST /products/_search
{
"suggest": {
"product-suggest": {
"prefix": "iph",
"completion": {
"field": "name_suggest"
}
}
}
}
This query asks Elasticsearch to suggest product names starting with "iph" using the completion suggester on the "name_suggest" field.
In the completion suggester, the main repeating operation is traversing the prefix tree (trie) built from all indexed suggestions.
- Primary operation: Navigating the trie nodes matching the prefix characters.
- How many times: Once per character in the prefix, not per total entries.
The time to find suggestions grows mostly with the length of the prefix typed, not the total number of suggestions stored.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 suggestions | Few steps, depends on prefix length |
| 1000 suggestions | Still about the same steps, depends on prefix length |
| 100,000 suggestions | Still about the same steps, depends on prefix length |
Pattern observation: The search time stays roughly the same as more suggestions are added, because it depends on prefix length, not total suggestions.
Time Complexity: O(k)
This means the time to get suggestions grows linearly with the length of the prefix typed, not with the total number of suggestions stored.
[X] Wrong: "The more suggestions stored, the slower the completion suggester will be."
[OK] Correct: The completion suggester uses a prefix tree, so it jumps directly to the prefix node. The total number of suggestions does not slow down the search much.
Understanding how the completion suggester scales helps you explain efficient search features in real applications, showing you know how data structures affect performance.
What if we changed the prefix to a very common short string? How would that affect the time complexity and the number of results returned?