Filter aggregation in Elasticsearch - Time & Space Complexity
When using a filter aggregation in Elasticsearch, we want to know how the time to get results changes as the data grows.
The question to answer: how does the number of documents affect the work Elasticsearch does to apply the filter?
Analyze the time complexity of the following code snippet.
{
  "aggs": {
    "filtered_data": {
      "filter": {
        "term": { "status": "active" }
      }
    }
  }
}
This snippet narrows the documents to those whose status field equals active and reports how many matched as doc_count under the filtered_data aggregation.
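To make the result concrete, here is a minimal Python sketch of what the aggregation computes. It is a toy model, not how Elasticsearch is implemented: the function name `filter_aggregation` and the sample documents are hypothetical, and only the `doc_count` key mirrors the field a filter aggregation returns.

```python
from typing import Any


def filter_aggregation(docs: list[dict[str, Any]], field: str, value: Any) -> dict[str, int]:
    """Toy model: count the documents whose `field` equals `value`."""
    doc_count = sum(1 for doc in docs if doc.get(field) == value)
    return {"doc_count": doc_count}


# Hypothetical sample documents, for illustration only.
sample_docs = [
    {"id": 1, "status": "active"},
    {"id": 2, "status": "inactive"},
    {"id": 3, "status": "active"},
    {"id": 4},  # no status field, so it cannot match
]

# Mirror the aggregation name used in the request above.
result = {"filtered_data": filter_aggregation(sample_docs, "status", "active")}
print(result)  # {'filtered_data': {'doc_count': 2}}
```

The model visits each document once, which is the behaviour the rest of this analysis reasons about.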
Look for repeated work Elasticsearch does when applying the filter.
- Primary operation: Checking each document's status field against the filter value.
- How many times: Once for every document the search considers, which is every document in the index when the request has no query.
As the number of documents grows, Elasticsearch checks more documents to find matches.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The work grows directly with the number of documents.
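The table can be reproduced with a small counting experiment. The sketch below assumes the simplified one-check-per-document model used in this analysis; `count_checks` and `make_docs` are hypothetical helpers, and the data is synthetic (every other document is active).

```python
def count_checks(docs, field, value):
    """Scan docs once and return (matches, checks performed)."""
    matches = 0
    checks = 0
    for doc in docs:
        checks += 1  # one comparison per document
        if doc.get(field) == value:
            matches += 1
    return matches, checks


def make_docs(n):
    """Build n synthetic documents; every other one is active."""
    return [{"status": "active" if i % 2 == 0 else "inactive"} for i in range(n)]


for n in (10, 100, 1000):
    matches, checks = count_checks(make_docs(n), "status", "active")
    print(f"n={n}: {checks} checks, {matches} matches")
```

The number of checks always equals the number of documents, regardless of how many of them match, which is why the growth is linear.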
Time Complexity: O(n), where n is the number of documents checked.
This means the time to apply the filter grows in a straight line as the number of documents increases.
[X] Wrong: "Filter aggregation runs instantly no matter how many documents there are."
[OK] Correct: Elasticsearch must check each document to see if it matches the filter, so more documents mean more work.
Understanding how filters scale helps you explain how search engines handle large data efficiently and why indexing matters.
What if we added a second filter inside a bool must clause? How would the time complexity change?
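One way to explore this question is to extend the same toy model to two clauses. The sketch below is an assumption-laden illustration, not Elasticsearch's implementation: the second clause on a `region` field, the helper `count_checks_all`, and the synthetic data are all hypothetical, and the model assumes the second clause is skipped as soon as the first one fails.

```python
# Request body with two term clauses inside bool/must.
# The "region" clause is a hypothetical addition for illustration.
request_body = {
    "aggs": {
        "filtered_data": {
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"status": "active"}},
                        {"term": {"region": "eu"}},
                    ]
                }
            }
        }
    }
}


def count_checks_all(docs, clauses):
    """Return (matches, checks) when every (field, value) clause must match."""
    matches = 0
    checks = 0
    for doc in docs:
        matched = True
        for field, value in clauses:
            checks += 1
            if doc.get(field) != value:
                matched = False
                break  # skip the remaining clauses for this document
        if matched:
            matches += 1
    return matches, checks


# Pull the (field, value) pairs out of the request body.
must = request_body["aggs"]["filtered_data"]["filter"]["bool"]["must"]
clauses = [(f, v) for clause in must for f, v in clause["term"].items()]

n = 1000
docs = [
    {
        "status": "active" if i % 2 == 0 else "inactive",
        "region": "eu" if i % 4 == 0 else "us",
    }
    for i in range(n)
]

matches, checks = count_checks_all(docs, clauses)
print(f"n={n}: {checks} checks, {matches} matches")
```

Under these assumptions the number of checks stays between n and 2n, so a second clause changes the constant factor but the growth remains O(n).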