How to Use Composite Aggregation in Elasticsearch
Use composite aggregation in Elasticsearch to group data by multiple fields and paginate results efficiently. Define sources for grouping keys and use after to fetch subsequent pages of aggregation buckets.
Syntax
The composite aggregation groups documents by multiple fields defined in sources. To paginate, pass the after_key value from the previous response as the after parameter; the next page resumes with the first bucket that sorts after that key.
- sources: An array of objects defining fields or scripts to group by.
- size: Number of buckets to return per page.
- after: Optional key to paginate from the last bucket.
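The three parameters above can also be assembled programmatically. The following is a minimal sketch rather than part of any client library: `build_composite_request` and its argument names are hypothetical, and it assumes every source is a plain terms source.

```python
from typing import Any, Dict, Optional


def build_composite_request(
    name: str,
    fields: Dict[str, str],
    size: int = 10,
    after: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a search body containing one composite aggregation.

    `fields` maps each source name to the document field it groups on.
    (Hypothetical helper; assumes terms sources only.)
    """
    composite: Dict[str, Any] = {
        "size": size,
        # One single-key object per source, in the order given.
        "sources": [
            {source: {"terms": {"field": field}}}
            for source, field in fields.items()
        ],
    }
    # "after" is optional: leave it out when requesting the first page.
    if after is not None:
        composite["after"] = after
    # Top-level "size": 0 asks for aggregation results without search hits.
    return {"size": 0, "aggs": {name: {"composite": composite}}}
```

Calling it with `{"field1": "field1.keyword", "field2": "field2.keyword"}` and an `after` value yields the same composite structure as the JSON that follows, plus the top-level `"size": 0`.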
json
{
  "aggs": {
    "my_buckets": {
      "composite": {
        "size": 10,
        "sources": [
          { "field1": { "terms": { "field": "field1.keyword" } } },
          { "field2": { "terms": { "field": "field2.keyword" } } }
        ],
        "after": { "field1": "value1", "field2": "value2" }
      }
    }
  }
}
Example
This example shows how to group documents by category.keyword and brand.keyword, returning 5 buckets per page. It also demonstrates how to paginate using the after key from the previous response.
json
{
  "size": 0,
  "aggs": {
    "products_by_category_and_brand": {
      "composite": {
        "size": 5,
        "sources": [
          { "category": { "terms": { "field": "category.keyword" } } },
          { "brand": { "terms": { "field": "brand.keyword" } } }
        ]
      }
    }
  }
}
// To paginate, pass the "after_key" value from the response as "after":
{
  "size": 0,
  "aggs": {
    "products_by_category_and_brand": {
      "composite": {
        "size": 5,
        "sources": [
          { "category": { "terms": { "field": "category.keyword" } } },
          { "brand": { "terms": { "field": "brand.keyword" } } }
        ],
        "after": { "category": "electronics", "brand": "sony" }
      }
    }
  }
}
Output
json
{
  "aggregations": {
    "products_by_category_and_brand": {
      "buckets": [
        { "key": { "category": "books", "brand": "harpercollins" }, "doc_count": 8 },
        { "key": { "category": "books", "brand": "penguin" }, "doc_count": 12 },
        { "key": { "category": "electronics", "brand": "apple" }, "doc_count": 15 },
        { "key": { "category": "electronics", "brand": "samsung" }, "doc_count": 10 },
        { "key": { "category": "electronics", "brand": "sony" }, "doc_count": 7 }
      ],
      "after_key": { "category": "electronics", "brand": "sony" }
    }
  }
}
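The request and response pair above can be turned into a loop that keeps feeding after_key back in as after until a page comes back empty. This is a sketch under stated assumptions: `iter_composite_buckets` is a hypothetical helper, and `search` stands for any callable you supply that sends a request body to Elasticsearch and returns the parsed response as a dict (for example, a thin wrapper around whichever client you use).

```python
import copy
from typing import Any, Callable, Dict, Iterator

# Assumed shape: takes a request body, returns the parsed JSON response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_composite_buckets(
    search: SearchFn, body: Dict[str, Any], agg_name: str
) -> Iterator[Dict[str, Any]]:
    """Yield every bucket of a composite aggregation, page by page."""
    # Work on a copy so the caller's request body is never mutated.
    request = copy.deepcopy(body)
    composite = request["aggs"][agg_name]["composite"]
    while True:
        result = search(request)["aggregations"][agg_name]
        buckets = result["buckets"]
        if not buckets:
            break  # an empty page means there is nothing left
        yield from buckets
        after_key = result.get("after_key")
        if after_key is None:
            break
        # Resume the next request right after the last bucket seen.
        composite["after"] = after_key
```

Because the function is a generator, callers can stop early without fetching the remaining pages.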
Common Pitfalls
- Using a text field instead of its keyword sub-field in a terms source causes errors or unexpected results.
- Forgetting to pass after on follow-up requests returns the same first page of buckets again.
- Setting size too high can cause performance issues.
- Composite aggregation does not support ordering by metrics; it only orders by the composite keys.
json
// Wrong usage: terms source on a text field, with an oversized page:
{
  "aggs": {
    "wrong_usage": {
      "composite": {
        "size": 10000,
        "sources": [
          { "field": { "terms": { "field": "text_field" } } }
        ]
      }
    }
  }
}
// Correct usage:
{
  "aggs": {
    "correct_usage": {
      "composite": {
        "size": 1000,
        "sources": [
          { "field": { "terms": { "field": "text_field.keyword" } } }
        ]
      }
    }
  }
}
Quick Reference
- sources: List of fields to group by, each with a unique name.
- size: Number of buckets per page (default 10).
- after: Key to paginate from last bucket.
- Use .keyword fields for exact terms.
- Composite aggregation is good for deep pagination of aggregations.
Key Takeaways
Composite aggregation groups data by multiple fields and supports efficient pagination.
Always use keyword fields for terms in composite aggregation to avoid errors.
Use the after key from the response to fetch the next page of buckets.
Avoid setting size too high to maintain good performance.
Composite aggregation orders buckets by keys, not by metric values.
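Since buckets come back ordered by key, one way to rank them by a metric such as doc_count is to collect the buckets from every page first and sort on the client. A minimal sketch, assuming the buckets are plain dicts shaped like those in the Output section; `top_buckets_by_doc_count` is a hypothetical helper name.

```python
import heapq
from typing import Any, Dict, Iterable, List


def top_buckets_by_doc_count(
    buckets: Iterable[Dict[str, Any]], n: int
) -> List[Dict[str, Any]]:
    """Return the n buckets with the highest doc_count, largest first.

    The result is only meaningful if `buckets` covers every page;
    ranking a single page would miss buckets on later pages.
    """
    return heapq.nlargest(n, buckets, key=lambda bucket: bucket["doc_count"])
```

This trades extra requests and client memory for an ordering the aggregation itself does not provide, so it suits result sets small enough to page through completely.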