Lens for drag-and-drop analysis in Elasticsearch - Time & Space Complexity
When using Lens for drag-and-drop analysis, Kibana translates each visualization into an Elasticsearch aggregation query, so it is important to understand how query processing time grows as the data gets bigger or the drag-and-drop configuration becomes more complex.
Analyze the time complexity of the following Lens drag-and-drop query.
```console
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "drag_drop_analysis": {
      "terms": { "field": "category.keyword", "size": 10 },
      "aggs": {
        "average_price": { "avg": { "field": "price" } }
      }
    }
  }
}
```
This query groups documents by category and calculates the average price for each group, similar to what Lens does when you drag a field to create a visualization.
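The work this query does can be sketched in plain Python. This is an illustrative simulation, not Elasticsearch internals: a single pass assigns each document to a bucket and keeps a running sum and count per bucket, which is exactly where the linear cost comes from.

```python
from collections import defaultdict

def terms_avg(docs, group_field, value_field, size=10):
    """Simulate a terms aggregation with a nested avg sub-aggregation.

    Each document is visited exactly once: O(n) bucket assignment,
    plus a constant-time running sum/count update per document.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for doc in docs:                      # one pass over all n documents
        key = doc[group_field]
        sums[key] += doc[value_field]
        counts[key] += 1
    # Rank buckets by document count and keep the top `size`,
    # similar to how the terms aggregation limits its output
    top = sorted(counts, key=counts.get, reverse=True)[:size]
    return {k: sums[k] / counts[k] for k in top}

docs = [
    {"category": "books", "price": 10.0},
    {"category": "books", "price": 20.0},
    {"category": "games", "price": 30.0},
]
print(terms_avg(docs, "category", "price"))  # {'books': 15.0, 'games': 30.0}
```

Note that trimming to the top 10 buckets happens only after every document has been scanned, which is the key point for the complexity analysis below.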
Look for repeated work inside the query.
- Primary operation: Elasticsearch groups documents by the "category.keyword" field using a terms aggregation.
- How many times: It processes each document once to assign it to a category bucket, then calculates the average price per bucket.
As the number of documents grows, Elasticsearch must check each document to place it in the right category bucket.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document checks and bucket assignments |
| 100 | About 100 document checks and bucket assignments |
| 1000 | About 1000 document checks and bucket assignments |
Pattern observation: The work grows roughly in direct proportion to the number of documents.
Time Complexity: O(n)
This means the time to complete the query grows linearly with the number of documents processed.
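The pattern in the table above can be reproduced with a toy operation counter (the 10 categories and the counter itself are illustrative assumptions, not Elasticsearch behavior):

```python
def count_bucket_ops(n):
    """Count per-document operations for a simulated terms aggregation."""
    ops = 0
    buckets = {}
    for i in range(n):
        key = i % 10                      # pretend there are 10 categories
        buckets[key] = buckets.get(key, 0) + 1
        ops += 1                          # one check + bucket assignment per document
    return ops

for n in (10, 100, 1000):
    print(n, count_bucket_ops(n))         # operations grow in lockstep with n
```

Doubling the number of documents doubles the operation count, which is what O(n) predicts.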
[X] Wrong: "The aggregation runs faster because it only returns 10 buckets, so it doesn't depend on the total documents."
[OK] Correct: Even if only 10 buckets are returned, Elasticsearch still scans all documents to assign them to buckets before limiting the output.
Understanding how aggregations scale helps you explain performance in real projects and shows you can reason about data processing costs clearly.
"What if we added a filter to reduce documents before aggregation? How would the time complexity change?"