Application performance monitoring in Elasticsearch - Time & Space Complexity
When monitoring application performance with Elasticsearch, we want to know how query time grows as more performance data arrives.
We ask: How does search and aggregation time change as the volume of monitoring data increases?
Analyze the time complexity of the following Elasticsearch query used for application performance monitoring.
```json
GET /apm-data/_search
{
  "size": 0,
  "query": {
    "range": { "timestamp": { "gte": "now-1h" } }
  },
  "aggs": {
    "avg_response_time": { "avg": { "field": "response_time" } }
  }
}
```
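To see why this is linear, here is a minimal Python sketch of the work the query performs: one pass over the documents, keeping only those whose timestamp falls in the last hour, then averaging their `response_time`. The function name and the in-memory document format are illustrative assumptions, not Elasticsearch internals.

```python
from datetime import datetime, timedelta

def avg_response_time_last_hour(docs, now):
    """Model of the query's work: a single pass over documents,
    applying the range filter, then averaging response_time."""
    cutoff = now - timedelta(hours=1)   # the "gte": "now-1h" boundary
    total = 0.0
    count = 0
    for doc in docs:                         # one step per document: O(n)
        if doc["timestamp"] >= cutoff:       # the "range" filter
            total += doc["response_time"]    # the "avg" aggregation
            count += 1
    return total / count if count else None

now = datetime(2024, 1, 1, 12, 0)
docs = [
    {"timestamp": now - timedelta(minutes=10), "response_time": 120.0},
    {"timestamp": now - timedelta(minutes=30), "response_time": 80.0},
    {"timestamp": now - timedelta(hours=2), "response_time": 500.0},  # outside window
]
print(avg_response_time_last_hour(docs, now))  # → 100.0
```

The loop body runs once per document, so the total work is proportional to the number of documents scanned.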
This query finds the average response time of application requests in the last hour.
Look at what repeats as data grows.
- Primary operation: Elasticsearch scans all matching documents in the time range.
- How many times: Once per document in the last hour.
As the number of documents in the last hour grows, the query must process more data.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Processes 10 documents |
| 100 | Processes 100 documents |
| 1000 | Processes 1000 documents |
Pattern observation: The work grows directly with the number of documents matching the time range.
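The pattern in the table can be checked with a small counting sketch (the helper name is hypothetical): count one aggregation step per matching document and observe that doubling the input doubles the work.

```python
def count_operations(n):
    """Count the aggregation steps when n documents match the range."""
    ops = 0
    for _ in range(n):  # one step per matching document
        ops += 1
    return ops

for n in (10, 100, 1000):
    print(n, count_operations(n))  # → 10 10, 100 100, 1000 1000
```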
Time Complexity: O(n)
This means the query time grows linearly with the number of documents in the selected time range.
[X] Wrong: "The aggregation runs instantly no matter how much data there is."
[OK] Correct: The aggregation must look at each matching document, so more data means more work and longer time.
Understanding how query time grows with data size helps you design monitoring and alerting systems that stay fast as your data volume increases.
What if we added a filter to only include error responses? How would the time complexity change?