Application performance monitoring in Elasticsearch - Time & Space Complexity
When monitoring application performance using Elasticsearch, we want to know how the time to process data grows as more performance data comes in.
We ask: How does the search and aggregation time change when the amount of monitoring data increases?
Analyze the time complexity of the following Elasticsearch query used for application performance monitoring.
GET /apm-data/_search
{
  "size": 0,
  "query": {
    "range": { "timestamp": { "gte": "now-1h" } }
  },
  "aggs": {
    "avg_response_time": { "avg": { "field": "response_time" } }
  }
}
This query finds the average response time of application requests in the last hour.
Look at what repeats as data grows.
- Primary operation: The avg aggregation reads the response_time value of every document that matches the time range.
- How many times: Once per document in the last hour.
As the number of documents in the last hour grows, the query must process more data.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Processes 10 documents |
| 100 | Processes 100 documents |
| 1000 | Processes 1000 documents |
Pattern observation: The work grows directly with the number of documents matching the time range.
Time Complexity: O(n)
This means the query time grows linearly with the number of documents in the selected time range.
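The linear growth in the table can be reproduced with a toy model. The sketch below is plain Python, not Elasticsearch code: the function name `avg_with_read_count` and the sample values are made up for illustration, and it only mimics the idea that an average needs one value read per matching document.

```python
def avg_with_read_count(response_times):
    """Return (average, number_of_value_reads) for a list of response times."""
    total = 0
    reads = 0
    for value in response_times:
        total += value  # one unit of work per matching document
        reads += 1
    average = total / reads if reads else None
    return average, reads


# Same input sizes as the table above; the values themselves are arbitrary.
for n in (10, 100, 1000):
    values = [100 + (i % 50) for i in range(n)]
    average, reads = avg_with_read_count(values)
    print(n, reads)
```

The read count always equals the number of matching documents, which is the O(n) behavior described above. Real query latency also depends on shards, caching, and hardware, none of which this model tries to capture.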
[X] Wrong: "The aggregation runs instantly no matter how much data there is."
[OK] Correct: The aggregation must look at each matching document, so more data means more work and longer time.
Understanding how query time grows with data size helps you design better monitoring and alerting systems that stay fast as data grows.
What if we added a filter to only include error responses? How would the time complexity change?
Practice
Solution
Step 1: Understand APM's role
APM is designed to monitor how fast an application runs and to find any errors it produces.
Step 2: Match purpose with options
Only "To track application speed and detect errors" describes tracking speed and errors, which fits APM's main goal.
Final Answer:
To track application speed and detect errors -> Option A
Quick Check:
APM purpose = Track speed and errors [OK]
- Confusing APM with security or backup tools
- Thinking APM manages cluster nodes
- Assuming APM stores user credentials
Solution
Step 1: Identify aggregation for average
The query uses "avg" aggregation on the field "transaction.duration.us" which stores response times in microseconds.
Step 2: Confirm query structure
Size is 0 to avoid returning documents, focusing only on aggregation results, which is correct for average calculation.
Final Answer:
GET /apm-*/_search {"size":0, "aggs": {"avg_response_time": {"avg": {"field": "transaction.duration.us"}}}} -> Option C
Quick Check:
Average aggregation query = GET /apm-*/_search {"size":0, "aggs": {"avg_response_time": {"avg": {"field": "transaction.duration.us"}}}} [OK]
- Using match_all without aggregation
- Using update_by_query instead of search
- Using max aggregation instead of avg
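As a rough illustration of the request body from this solution, the sketch below assembles it as a Python dictionary and prints it as JSON. The helper name `build_avg_query` is hypothetical, and sending the body to a cluster (for example through a client library) is deliberately left out.

```python
import json


def build_avg_query(field="transaction.duration.us", agg_name="avg_response_time"):
    """Build a search body that returns only an avg aggregation (no hits)."""
    return {
        "size": 0,  # skip document hits; only the aggregation result is needed
        "aggs": {agg_name: {"avg": {"field": field}}},
    }


body = build_avg_query()
print(json.dumps(body, indent=2))
```

Swapping `"avg"` for `"max"` in this body would produce the maximum instead of the average, which is one of the mistakes listed above.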
GET /apm-*/_search
{
  "size": 0,
  "aggs": {
    "avg_response_time": {
      "avg": { "field": "transaction.duration.us" }
    }
  }
}
Solution
Step 1: Understand aggregation type
The query requests the average of the field "transaction.duration.us" which holds response times in microseconds.
Step 2: Match output to aggregation
The output shows an aggregation named "avg_response_time" with a numeric value representing the average, matching {"aggregations":{"avg_response_time":{"value":250000}}}.
Final Answer:
{"aggregations":{"avg_response_time":{"value":250000}}} -> Option D
Quick Check:
Average aggregation output = {"aggregations":{"avg_response_time":{"value":250000}}} [OK]
- Confusing hits total with aggregation result
- Expecting max instead of avg
- Assuming error without checking field existence
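To make the shape of that output concrete, here is a minimal sketch that reads the average out of the response from the answer above and converts it from microseconds to milliseconds. The function name `avg_duration_ms` is an assumption for illustration, and the `None` check reflects the avg aggregation returning a null value when no document contributed one.

```python
import json

# The aggregation part of the response shown in the answer above.
raw = '{"aggregations":{"avg_response_time":{"value":250000}}}'


def avg_duration_ms(response, agg_name="avg_response_time"):
    """Return the average duration in milliseconds, or None if there is no value."""
    value = response["aggregations"][agg_name]["value"]
    if value is None:  # no matching documents had a value for the field
        return None
    return value / 1000.0  # microseconds -> milliseconds


response = json.loads(raw)
print(avg_duration_ms(response))
```

The average lives under `aggregations`, not under `hits`, which is why confusing the hits total with the aggregation result leads to the wrong answer.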
Fielddata is disabled on text fields by default. What is the likely cause?
Solution
Step 1: Analyze error message
The error says fielddata is disabled on text fields, which means aggregation was attempted on a text field.
Step 2: Understand aggregation requirements
Aggregations like average require numeric fields, so using a text field causes this error.
Final Answer:
Trying to aggregate on a text field instead of a numeric field -> Option A
Quick Check:
Fielddata error = Aggregation on text field [OK]
- Blaming index pattern or auth for this error
- Assuming syntax error without checking field type
- Ignoring field data type requirements
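One way to catch this mistake early is to look up the field's type in the index mapping before aggregating on it. The sketch below works on a hand-written `properties` dictionary shaped like the one an index mapping contains. The example mapping, the helper names `field_type` and `can_average`, and the decision to ignore multi-fields are all assumptions made for illustration.

```python
# Numeric field types in Elasticsearch mappings.
NUMERIC_TYPES = {
    "long", "integer", "short", "byte", "double",
    "float", "half_float", "scaled_float", "unsigned_long",
}


def field_type(properties, dotted_path):
    """Walk nested 'properties' objects and return the leaf field's type, or None."""
    node = {"properties": properties}
    for part in dotted_path.split("."):
        node = node.get("properties", {}).get(part)
        if node is None:
            return None  # field is not mapped (multi-fields are not handled here)
    return node.get("type")


def can_average(properties, dotted_path):
    """True only when the field is mapped with a numeric type."""
    return field_type(properties, dotted_path) in NUMERIC_TYPES


# Hypothetical mapping: a numeric duration and a text name.
example_properties = {
    "transaction": {
        "properties": {
            "duration": {"properties": {"us": {"type": "long"}}},
            "name": {"type": "text"},
        }
    }
}

print(can_average(example_properties, "transaction.duration.us"))
print(can_average(example_properties, "transaction.name"))
```

A check like this points at the field type as the cause, rather than the index pattern, authentication, or query syntax.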
Solution
Step 1: Identify filter for transactions with errors
Transactions with errors have a non-empty "error.id" field, so we use "exists" query on "error.id".
Step 2: Confirm aggregation on filtered data
The aggregation calculates average response time only on filtered documents, which is correct.
Final Answer:
{ "size": 0, "query": { "exists": { "field": "error.id" } }, "aggs": { "avg_response_time": { "avg": { "field": "transaction.duration.us" } } } } -> Option B
Quick Check:
Filter errors with exists + avg aggregation = { "size": 0, "query": { "exists": { "field": "error.id" } }, "aggs": { "avg_response_time": { "avg": { "field": "transaction.duration.us" } } } } [OK]
- Using empty term query instead of exists
- Calculating average without filtering errors
- Filtering for success instead of errors
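The effect of filtering before averaging can be mimicked locally. The sketch below is a toy stand-in, not how Elasticsearch evaluates an exists query: documents are flat Python dictionaries, "has an error" is approximated as `error.id` being present and not `None`, and the function name and sample data are invented for illustration.

```python
def avg_duration_for_errors(docs):
    """Average transaction.duration.us over docs that carry an error.id value."""
    total = 0
    matched = 0
    for doc in docs:
        # Toy stand-in for the exists query: skip docs without an error.id.
        if doc.get("error.id") is None:
            continue
        total += doc["transaction.duration.us"]
        matched += 1
    average = total / matched if matched else None
    return average, matched


sample_docs = [
    {"transaction.duration.us": 100000},
    {"transaction.duration.us": 300000, "error.id": "e1"},
    {"transaction.duration.us": 500000, "error.id": "e2"},
]

avg, matched = avg_duration_for_errors(sample_docs)
print(avg, matched)
```

Only the two documents with an `error.id` contribute to the average. Leaving the filter out would average all three, which is the "calculating average without filtering errors" mistake listed above.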
