
How to Optimize Aggregation Performance in Elasticsearch

To optimize aggregation performance in Elasticsearch, narrow the document set with a filter (in the query or as a filter aggregation) before bucketing, aggregate on fields that have doc_values (enabled by default for keyword, numeric, and date fields), and avoid deeply nested aggregations. The shard_size and execution_hint parameters of the terms aggregation then let you trade accuracy and memory against speed.

📐

Syntax

An aggregation request names the aggregation type, the field to aggregate on, and optional parameters that affect performance.

Key parts:

aggs: Defines the aggregation block.
terms: Common aggregation type to group by field values.
field: The field to aggregate.
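size: Limits the number of buckets returned.
shard_size: Sets how many candidate buckets each shard returns to the coordinating node before merging. It defaults to size * 1.5 + 10 and cannot be smaller than size.
execution_hint: Suggests how Elasticsearch collects terms: global_ordinals (the default for keyword fields) or map, which can be faster when only a small number of documents match the query.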

json

{
  "aggs": {
    "example_agg": {
      "terms": {
        "field": "field_name",
        "size": 10,
        "shard_size": 20,
        "execution_hint": "map"
      }
    }
  }
}
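To make the relationship between size and shard_size concrete, the sketch below computes the default shard_size using the formula given in the Elasticsearch documentation (size * 1.5 + 10) and builds a terms aggregation body. The helper names default_shard_size and terms_agg are hypothetical, and truncating the fractional result is an assumption about rounding.

```python
import json
from typing import Optional


def default_shard_size(size: int) -> int:
    """Documented default for a terms aggregation: size * 1.5 + 10.

    Truncating to an integer is an assumption made for this sketch.
    """
    return int(size * 1.5 + 10)


def terms_agg(field: str, size: int, shard_size: Optional[int] = None) -> dict:
    """Build a terms aggregation body (hypothetical helper, illustration only)."""
    body = {"field": field, "size": size}
    if shard_size is not None:
        # Elasticsearch resets shard_size to size when it is set lower,
        # so the same clamp is applied here.
        body["shard_size"] = max(shard_size, size)
    return {"terms": body}


request = {"aggs": {"example_agg": terms_agg("field_name", 10, 20)}}
print(default_shard_size(10))
print(json.dumps(request, indent=2))
```

With size set to 10 the default works out to 25, so the shard_size of 20 in the syntax block asks each shard for fewer candidates than the default would. That lowers per-shard work at some cost in count accuracy.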

💻

Example

This example optimizes a terms aggregation by setting shard_size and by filtering in the query, so that only matching documents reach the aggregation. Because the filter narrows the document set, execution_hint is set to map.

json

{
  "query": {
    "bool": {
      "filter": [
        { "term": { "status": "active" } }
      ]
    }
  },
  "aggs": {
    "active_users": {
      "terms": {
        "field": "user.keyword",
        "size": 5,
        "shard_size": 10,
        "execution_hint": "map"
      }
    }
  }
}

Output

{
  "aggregations": {
    "active_users": {
      "buckets": [
        { "key": "alice", "doc_count": 15 },
        { "key": "bob", "doc_count": 12 },
        { "key": "carol", "doc_count": 9 },
        { "key": "dave", "doc_count": 7 },
        { "key": "eve", "doc_count": 5 }
      ]
    }
  }
}
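The request above filters in the query clause. The same narrowing can also be written as a filter aggregation, which the introduction mentions. The sketch below builds that request body as a Python dict; the function name filtered_terms_request and the wrapper aggregation name active_only are hypothetical, and "size": 0 (which skips returning search hits) is an addition that the original example does not use.

```python
import json


def filtered_terms_request(filter_field, filter_value, terms_field, size=5):
    """Build a search body that wraps a terms aggregation in a filter aggregation."""
    return {
        "size": 0,  # assumption: only aggregation results are needed, not hits
        "aggs": {
            "active_only": {  # hypothetical aggregation name
                # Only documents matching this term reach the sub-aggregation.
                "filter": {"term": {filter_field: filter_value}},
                "aggs": {
                    "active_users": {
                        "terms": {"field": terms_field, "size": size}
                    }
                },
            }
        },
    }


body = filtered_terms_request("status", "active", "user.keyword")
print(json.dumps(body, indent=2))
```

A filter aggregation is most useful when one request runs several aggregations over different subsets. When every aggregation shares the same filter, placing it in the query, as in the example, is typically the simpler choice.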

⚠️

Common Pitfalls

Common mistakes that hurt aggregation performance include:

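Aggregating on a text field instead of its keyword sub-field. Text fields have no doc_values, so they can be aggregated only by enabling fielddata, which loads terms into heap memory.
Using very large size or shard_size values, which causes high memory use.
Running nested aggregations on large datasets without filters to reduce the data first.
Leaving execution_hint at its default when very few documents match the query. In that case map can be faster; otherwise the default global_ordinals is usually the better choice.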

Always check your mapping and use filters to limit data before aggregation.

json

{
  "aggs": {
    "bad_agg": {
      "terms": {
        "field": "text_field",
        "size": 10000
      }
    }
  }
}

Better approach:

json
{
  "aggs": {
    "good_agg": {
      "terms": {
        "field": "text_field.keyword",
        "size": 100,
        "execution_hint": "map"
      }
    }
  }
}
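Since the fix depends on the mapping, it can help to check the field definition before writing the aggregation. The sketch below assumes a mapping shaped like Elasticsearch's default dynamic mapping for strings (a text field with a keyword sub-field). The helper aggregatable_field and the sample mapping are hypothetical, and only top-level leaf fields are handled.

```python
def aggregatable_field(properties: dict, name: str):
    """Return a field path suitable for a terms aggregation, or None.

    `properties` is the "properties" object of an index mapping.
    Hypothetical helper: handles top-level leaf fields only.
    """
    field = properties.get(name)
    if field is None:
        return None
    if field.get("type") != "text":
        # Most non-text field types have doc_values unless explicitly disabled.
        return name if field.get("doc_values", True) else None
    # Text fields have no doc_values: look for a keyword sub-field instead.
    for sub_name, sub in field.get("fields", {}).items():
        if sub.get("type") == "keyword":
            return f"{name}.{sub_name}"
    return None


# Sample mapping (illustrative data, not taken from a real index).
sample_properties = {
    "text_field": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
    "status": {"type": "keyword"},
    "notes": {"type": "text"},
}

for field_name in ("text_field", "status", "notes"):
    print(field_name, "->", aggregatable_field(sample_properties, field_name))
```

A result of None signals that the field would need a mapping change, such as adding a keyword sub-field, before it can be aggregated efficiently.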

📊

Quick Reference

Tip	Description
Use filters before aggregations	Reduce data size early to speed up aggregation.
Keep doc_values enabled	doc_values are on by default for keyword, numeric, and date fields; do not disable them on fields you aggregate.
Limit size and shard_size	Keep bucket counts reasonable to avoid memory issues.
Use execution_hint	Try map when few documents match the query; otherwise the default global_ordinals is usually faster.
Avoid aggregating on analyzed text	Use keyword or numeric fields instead for aggregations.

✅

Key Takeaways

Apply filters to reduce data before running aggregations for better speed.

Use fields with doc_values enabled to improve aggregation performance.

Control bucket sizes with size and shard_size to manage memory use.

Set execution_hint to map only when few documents match the query; otherwise keep the default.

Avoid aggregating on analyzed text fields; use keyword or numeric fields instead.