Bool query (must, should, must_not, filter) in Elasticsearch - Time & Space Complexity
When using a bool query in Elasticsearch, it's important to know how the search time changes as we add more conditions.
We want to understand how the number of conditions affects the work Elasticsearch does.
Analyze the time complexity of the following bool query.
{
"query": {
"bool": {
"must": [ { "match": { "title": "search" } } ],
"should": [ { "term": { "status": "published" } } ],
"must_not": [ { "range": { "date": { "lt": "2023-01-01" } } } ],
"filter": [ { "term": { "category": "books" } } ]
}
}
}
This query combines several conditions to find documents matching all must, some should, excluding must_not, and applying filters.
Look at how many times Elasticsearch checks each condition.
- Primary operation: Checking each document against all conditions in must, should, must_not, and filter.
- How many times: Once per document for each condition group.
As the number of documents grows, Elasticsearch checks each document against the bool query conditions.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document checks |
| 100 | About 100 document checks |
| 1000 | About 1000 document checks |
Pattern observation: The work grows roughly in direct proportion to the number of documents.
Time Complexity: O(n)
This means the time to run the query grows linearly with the number of documents to check.
[X] Wrong: "Adding more conditions in must, should, or filter does not affect query time much."
[OK] Correct: Each added condition means more checks per document, so the total work increases with more conditions.
Understanding how bool queries scale helps you explain how search engines handle complex filters efficiently.
What if we added nested bool queries inside must? How would the time complexity change?