
How to Optimize Aggregation Performance in Elasticsearch

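To optimize aggregation performance in Elasticsearch, filter documents before aggregating, aggregate on fields backed by doc_values (keyword, numeric, and date fields have them by default) rather than on analyzed text, and avoid deeply nested aggregations over large result sets. For terms aggregations, tune size and shard_size to bound the work each shard does, and reach for execution_hint only when the default execution mode is measurably slow.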
📐

Syntax

An aggregation request names the aggregation, gives its type and the field to aggregate on, and accepts optional parameters that affect performance.

Key parts:

  • aggs: Defines the aggregation block.
  • terms: Common aggregation type to group by field values.
  • field: The field to aggregate.
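  • size: Limits the number of buckets returned.
  • shard_size: Number of candidate buckets each shard sends to the coordinating node before merging. It defaults to size * 1.5 + 10; higher values improve count accuracy at the cost of memory and network traffic.
  • execution_hint: Selects how terms are collected. The default for keyword fields is global_ordinals; map can be faster, but only when the query matches very few documents.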
json
{
  "aggs": {
    "example_agg": {
      "terms": {
        "field": "field_name",
        "size": 10,
        "shard_size": 20,
        "execution_hint": "map"
      }
    }
  }
}
💻

Example

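This example narrows the documents with a filter clause in the query (filter context skips scoring and can be cached) and then runs a terms aggregation with explicit size and shard_size values. The map hint is worthwhile here only if the status filter matches a small fraction of the index; otherwise the default execution mode is usually faster.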

json
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } }
      ]
    }
  },
  "aggs": {
    "active_users": {
      "terms": {
        "field": "user.keyword",
        "size": 5,
        "shard_size": 10,
        "execution_hint": "map"
      }
    }
  }
}
Output
{
  "aggregations": {
    "active_users": {
      "buckets": [
        { "key": "alice", "doc_count": 15 },
        { "key": "bob", "doc_count": 12 },
        { "key": "carol", "doc_count": 9 },
        { "key": "dave", "doc_count": 7 },
        { "key": "eve", "doc_count": 5 }
      ]
    }
  }
}
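
When the restriction should apply to a single aggregation rather than to the whole search, the same condition can be written as a filter aggregation that wraps the terms aggregation. The following is a minimal sketch that reuses the status and user.keyword fields from the example; the name active_only is only illustrative.

json
{
  "aggs": {
    "active_only": {
      "filter": { "term": { "status": "active" } },
      "aggs": {
        "active_users": {
          "terms": {
            "field": "user.keyword",
            "size": 5
          }
        }
      }
    }
  }
}

The buckets then appear under aggregations.active_only.active_users, next to a doc_count for the filter itself. If every aggregation in the request needs the same restriction, keeping it in the query is usually the simpler option.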
⚠️

Common Pitfalls

Common mistakes that hurt aggregation performance include:

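  • Aggregating on text fields. Text fields have no doc_values, so the request fails unless fielddata is enabled, and fielddata is expensive on the heap. Use a keyword sub-field instead.
  • Using very large size or shard_size values, which raises memory use on every shard and on the coordinating node.
  • Running deeply nested aggregations over large datasets without a query or filter to narrow the documents first.
  • Setting execution_hint without measuring. The default (global_ordinals for keyword fields) is usually the fastest; map tends to help only when the query matches very few documents.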

Always check your mapping and use filters to limit data before aggregation.

json
{
  "aggs": {
    "bad_agg": {
      "terms": {
        "field": "text_field",
        "size": 10000
      }
    }
  }
}

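Better approach: aggregate on the keyword sub-field, keep size small, and leave the execution mode at its default, since no query narrows the documents here.

json
{
  "aggs": {
    "good_agg": {
      "terms": {
        "field": "text_field.keyword",
        "size": 100
      }
    }
  }
}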
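
The .keyword suffix refers to a keyword sub-field, which exists only if the mapping defines one. Below is a minimal mapping sketch, assuming a field named text_field as in the pitfall example. The ignore_above value of 256 mirrors what dynamic mapping generates for string fields and should be adjusted to your data.

json
{
  "mappings": {
    "properties": {
      "text_field": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}

With a mapping like this, full-text queries go to text_field, while aggregations and sorting use text_field.keyword, which is backed by doc_values.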
📊

Quick Reference

Tip | Description
Use filters before aggregations | Narrow the document set early so aggregations touch less data.
Rely on doc_values | Aggregate on fields that have doc_values (on by default for keyword, numeric, and date fields) and do not disable them on those fields.
Limit size and shard_size | Keep bucket counts reasonable to avoid memory issues.
Use execution_hint selectively | Keep the default unless measurements say otherwise; map can help when few documents match the query.
Avoid aggregating on analyzed text | Use keyword or numeric fields instead for aggregations.
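
Several of these tips can be combined in one request. The sketch below filters in the query, aggregates on a keyword field with a small size, and adds a single metric sub-aggregation. The timestamp and response_ms fields and the aggregation names are hypothetical and stand in for whatever your index contains.

json
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } },
        { "range": { "timestamp": { "gte": "now-1d/d" } } }
      ]
    }
  },
  "aggs": {
    "top_users": {
      "terms": {
        "field": "user.keyword",
        "size": 5
      },
      "aggs": {
        "avg_response": {
          "avg": { "field": "response_ms" }
        }
      }
    }
  }
}

Because the filters run first, the sub-aggregation is computed only over documents that survive them, and only one level of nesting is evaluated per bucket.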

Key Takeaways

Apply filters to reduce data before running aggregations for better speed.
Aggregate on fields with doc_values, which keyword, numeric, and date fields have by default.
Control bucket sizes with size and shard_size to manage memory use.
Leave execution_hint at its default unless measurements show that map is faster for a query that matches few documents.
Avoid aggregating on analyzed text fields; use keyword or numeric fields instead.