How to Optimize Aggregation Performance in Elasticsearch
To optimize aggregation performance in Elasticsearch, use filters (a query filter or a filter aggregation) to reduce the data early, aggregate on fields that have doc_values enabled, and avoid heavy nested aggregations. The shard_size and execution_hint settings give further control over resource use and aggregation speed.
Syntax
Elasticsearch aggregation syntax includes specifying the aggregation type, the field to aggregate on, and optional parameters to control performance.
Key parts:
- aggs: Defines the aggregation block.
- terms: A common aggregation type that groups documents by field value.
- field: The field to aggregate on.
- size: Limits the number of buckets returned.
- shard_size: Controls how many candidate buckets each shard returns to the coordinating node before merging.
- execution_hint: Suggests how Elasticsearch should execute the aggregation (map or global_ordinals for a terms aggregation).
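The parts listed above can also be assembled programmatically. The following is a minimal sketch in Python, not an official client API: the helper names build_terms_agg and default_shard_size are hypothetical. It only builds the request body as a dict; the default_shard_size formula reflects the default that the Elasticsearch documentation gives for terms aggregations (size * 1.5 + 10), and actual behavior may differ, for example on single-shard indices.

```python
import json


def default_shard_size(size: int) -> int:
    """Documented default for terms aggregations: size * 1.5 + 10 (assumed here)."""
    return int(size * 1.5 + 10)


def build_terms_agg(name, field, size=10, shard_size=None, execution_hint=None):
    """Hypothetical helper: build an "aggs" block with one terms aggregation."""
    terms = {"field": field, "size": size}
    # Optional parameters are only emitted when set, so Elasticsearch
    # falls back to its own defaults otherwise.
    if shard_size is not None:
        terms["shard_size"] = shard_size
    if execution_hint is not None:
        terms["execution_hint"] = execution_hint
    return {"aggs": {name: {"terms": terms}}}


body = build_terms_agg(
    "example_agg", "field_name", size=10, shard_size=20, execution_hint="map"
)
print(json.dumps(body, indent=2))
```

Leaving shard_size and execution_hint unset is often a reasonable starting point; set them explicitly only after measuring.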
json
{
  "aggs": {
    "example_agg": {
      "terms": {
        "field": "field_name",
        "size": 10,
        "shard_size": 20,
        "execution_hint": "map"
      }
    }
  }
}
Example
This example optimizes a terms aggregation by setting shard_size and by using a bool query filter to reduce the data before it is aggregated.
json
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } }
      ]
    }
  },
  "aggs": {
    "active_users": {
      "terms": {
        "field": "user.keyword",
        "size": 5,
        "shard_size": 10,
        "execution_hint": "map"
      }
    }
  }
}
Output
{
  "aggregations": {
    "active_users": {
      "buckets": [
        { "key": "alice", "doc_count": 15 },
        { "key": "bob", "doc_count": 12 },
        { "key": "carol", "doc_count": 9 },
        { "key": "dave", "doc_count": 7 },
        { "key": "eve", "doc_count": 5 }
      ]
    }
  }
}
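The request in this example narrows the documents with a query-level filter. The same narrowing can be expressed as a filter aggregation, which restricts only the aggregation rather than the whole search. The sketch below builds such a request body in Python; the helper name build_filtered_terms and the bucket name active_only are hypothetical, and the top-level "size": 0 is an added assumption that skips returning search hits when only the buckets are needed.

```python
import json


def build_filtered_terms(filter_clause, field, size=5, shard_size=10):
    """Hypothetical helper: wrap a terms aggregation in a filter aggregation."""
    return {
        "size": 0,  # assumption: hits are not needed, only aggregation results
        "aggs": {
            "active_only": {
                # Only documents matching this clause reach the sub-aggregation.
                "filter": filter_clause,
                "aggs": {
                    "active_users": {
                        "terms": {
                            "field": field,
                            "size": size,
                            "shard_size": shard_size,
                        }
                    }
                },
            }
        },
    }


body = build_filtered_terms({"term": {"status": "active"}}, "user.keyword")
print(json.dumps(body, indent=2))
```

With this shape, the buckets are nested one level deeper in the response, under aggregations.active_only.active_users.buckets.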
Common Pitfalls
Common mistakes that hurt aggregation performance include:
- Aggregating on a text field directly instead of on a keyword sub-field with doc_values enabled.
- Using very large size or shard_size values, which causes high memory use.
- Running nested aggregations on large datasets without filters to reduce the data.
- Not using execution_hint to guide Elasticsearch on aggregation strategy.
Always check your mapping and use filters to limit data before aggregation.
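Checking the mapping can be partly automated. The sketch below is a simplified illustration, not an Elasticsearch API: the function aggregatable_field and the sample properties dict are hypothetical, and it assumes a flat "properties" section like the one returned under mappings.properties by the get-mapping API. It also assumes that a typed non-text field has doc_values unless the mapping disables them, which holds for common types such as keyword and numerics but not necessarily for every field type.

```python
def aggregatable_field(properties, field):
    """Return a field name that looks safe for a terms aggregation, or None.

    Simplified assumptions: flat properties only, text fields have no
    doc_values, and other typed fields have doc_values unless disabled.
    """
    spec = properties.get(field)
    if spec is None or "type" not in spec:
        return None  # unknown field, or an object field without its own type
    if spec["type"] == "text":
        # Look for a keyword sub-field such as text_field.keyword.
        for sub_name, sub_spec in spec.get("fields", {}).items():
            if sub_spec.get("type") == "keyword" and sub_spec.get("doc_values", True):
                return f"{field}.{sub_name}"
        return None
    return field if spec.get("doc_values", True) else None


# Hypothetical mapping used only for illustration.
properties = {
    "text_field": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "status": {"type": "keyword"},
    "notes": {"type": "text"},
    "raw": {"type": "keyword", "doc_values": False},
}

for name in properties:
    print(name, "->", aggregatable_field(properties, name))
```

A result of None suggests adding a keyword sub-field to the mapping, or choosing a different field, before aggregating.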
json
{
  "aggs": {
    "bad_agg": {
      "terms": {
        "field": "text_field",
        "size": 10000
      }
    }
  }
}
Better approach:
json
{
  "aggs": {
    "good_agg": {
      "terms": {
        "field": "text_field.keyword",
        "size": 100,
        "execution_hint": "map"
      }
    }
  }
}
Quick Reference
| Tip | Description |
|---|---|
| Use filters before aggregations | Reduce data size early to speed up aggregation. |
| Enable doc_values | Ensure fields used in aggregations have doc_values enabled for fast access. |
| Limit size and shard_size | Keep bucket counts reasonable to avoid memory issues. |
| Use execution_hint | Guide Elasticsearch to use the fastest aggregation method. |
| Avoid aggregating on analyzed text | Use keyword or numeric fields instead for aggregations. |
Key Takeaways
Apply filters to reduce data before running aggregations for better speed.
Use fields with doc_values enabled to improve aggregation performance.
Control bucket sizes with size and shard_size to manage memory use.
Set execution_hint to optimize how Elasticsearch processes aggregations.
Avoid aggregating on analyzed text fields; use keyword or numeric fields instead.