Synonym handling in Elasticsearch - Time & Space Complexity
When Elasticsearch processes synonyms, it changes how many words it looks at during a search.
We want to know how adding synonyms affects the work Elasticsearch does as the input grows.
Analyze the time complexity of the following synonym filter in Elasticsearch.
```json
{
  "settings": {
    "analysis": {
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms": ["quick,fast", "jumps,leaps"]
        }
      },
      "analyzer": {
        "synonym_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "synonym_filter"]
        }
      }
    }
  }
}
```
This configuration defines a synonym filter that processes tokens during analysis. By default Elasticsearch *expands* rather than replaces: each token that matches a rule is kept and its synonyms are added alongside it, so "quick" produces both "quick" and "fast".
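To make the expansion behavior concrete, here is a minimal Python sketch (a simulation, not Elasticsearch internals) of an expand-style synonym step applied after lowercasing and whitespace tokenization:

```python
# Build a lookup table from the synonym rules "quick,fast" and "jumps,leaps".
rules = [["quick", "fast"], ["jumps", "leaps"]]
synonym_map = {}
for group in rules:
    for word in group:
        synonym_map[word] = group  # every member maps to the full group

def analyze(text):
    """Lowercase, split on whitespace, then expand each token's synonyms."""
    out = []
    for token in text.lower().split():
        # One dictionary lookup per token; unknown tokens pass through as-is.
        out.extend(synonym_map.get(token, [token]))
    return out

print(analyze("The Quick fox jumps"))
# → ['the', 'quick', 'fast', 'fox', 'jumps', 'leaps']
```

Note how "quick" and "jumps" each emit their whole synonym group, while "the" and "fox" pass through unchanged.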
Look at what repeats when analyzing text with synonyms.
- Primary operation: For each token (word), check if it matches any synonym group.
- How many times: Once per token in the input text.
As the number of words (tokens) grows, the synonym filter performs one lookup per token. The synonym rules are compiled into a lookup structure ahead of time, so each check costs roughly constant time regardless of how long the synonym list is.
| Input Size (n tokens) | Approx. Operations |
|---|---|
| 10 | 10 synonym checks |
| 100 | 100 synonym checks |
| 1000 | 1000 synonym checks |
Pattern observation: The work grows directly with the number of tokens; more words mean more checks.
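The table above can be reproduced with a small counting sketch (hypothetical names, same idea as the filter): the number of synonym checks always equals the number of tokens.

```python
synonym_map = {"quick": ["quick", "fast"], "fast": ["quick", "fast"]}

def count_checks(tokens):
    """Count one synonym lookup per input token."""
    checks = 0
    for token in tokens:
        checks += 1                      # one check per token
        synonym_map.get(token, [token])  # the lookup itself
    return checks

for n in (10, 100, 1000):
    print(n, count_checks(["word"] * n))  # checks grow linearly: 10, 100, 1000
```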
Time Complexity: O(n)
This means the time to process synonyms grows in a straight line with the number of words analyzed.
[X] Wrong: "Adding synonyms makes the search time explode exponentially."
[OK] Correct: Actually, each word is checked once against the synonym list, so time grows linearly, not exponentially.
Understanding how synonym handling scales helps you explain search performance clearly and confidently.
What if the synonym list grew from 10 to 10,000 entries? How would that affect the time complexity?
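One way to explore that question: because implementations store synonyms in a precompiled lookup structure (Lucene uses an FST; a hash map illustrates the same point), the cost per token barely changes as the rule list grows. Time complexity stays O(n) in the token count; what grows is the memory the synonym structure occupies. A sketch, with hypothetical entries:

```python
# Two synonym maps: 10 entries vs. 10,000 entries (illustrative data).
small_map = {f"word{i}": [f"word{i}", f"syn{i}"] for i in range(10)}
big_map = {f"word{i}": [f"word{i}", f"syn{i}"] for i in range(10_000)}

def expand(tokens, synonym_map):
    """One O(1)-average lookup per token, independent of map size."""
    out = []
    for token in tokens:
        out.extend(synonym_map.get(token, [token]))
    return out

tokens = ["word3", "hello"] * 5
# Same number of lookups against either map, so the work per token is the
# same; only the memory footprint of the map differs.
print(expand(tokens, small_map) == expand(tokens, big_map))  # → True
```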