
Why Cardinality aggregation in Elasticsearch? - Purpose & Use Cases

The Big Idea

What if you could count millions of unique visitors in seconds without breaking a sweat?

The Scenario

Imagine you have a huge list of customer visits to a store, and you want to know how many unique customers came in. Checking each record one by one and memorizing every customer is like trying to count every unique face in a busy crowd without any help.

The Problem

Manually tracking unique items in a large data set is slow and memory-hungry. It's easy to make mistakes, miss entries, or double-count. As the data grows, this approach becomes impossible to manage efficiently.

The Solution

Cardinality aggregation in Elasticsearch counts unique values approximately, using the HyperLogLog++ algorithm. It handles huge data sets efficiently without storing every single item, trading a small, configurable error for big savings in time and memory.
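Elasticsearch's production implementation is HyperLogLog++, which is more involved than what fits here, but a simplified HyperLogLog sketch in Python shows the core trick: hash every value, keep only a fixed array of small registers, and estimate the distinct count from them. The parameter `p` and the register layout below are illustrative, not Elasticsearch's internals.

```python
import hashlib
import math

def hll_estimate(items, p=10):
    """Estimate the number of distinct items with a basic HyperLogLog
    sketch: memory stays fixed at m = 2**p small registers, no matter
    how many items stream past."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        # 64-bit stable hash of the item
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - p)                # first p bits choose a register
        rest = h & ((1 << (64 - p)) - 1)   # remaining 64 - p bits
        # rank = 1-based position of the leftmost set bit in `rest`
        rank = (64 - p) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)
    # harmonic mean of per-register estimates, scaled by the alpha constant
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # small-range correction: fall back to linear counting
        return m * math.log(m / zeros)
    return raw
```

With `p=10` (1,024 registers, a few kilobytes of memory), the estimate for millions of items is typically within a few percent of the true count. Elasticsearch exposes a similar accuracy-vs-memory dial through the `precision_threshold` option on the cardinality aggregation.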

Before vs After
Before
Loop through all records, add each customer ID to a list if not already there, then count the list length.
After
{
  "aggs": {
    "unique_customers": {
      "cardinality": {
        "field": "customer_id.keyword"
      }
    }
  }
}
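The "Before" approach can be made concrete with a short Python sketch; the record shape and `customer_id` field are assumptions for illustration:

```python
def count_unique_customers(visits):
    """Naive approach: remember every customer ID ever seen."""
    seen = set()                      # memory grows with each distinct ID
    for visit in visits:
        seen.add(visit["customer_id"])
    return len(seen)

visits = [
    {"customer_id": "c1"},
    {"customer_id": "c2"},
    {"customer_id": "c1"},            # repeat visit, counted only once
]
print(count_unique_customers(visits))  # -> 2
```

This works, but the `seen` set must hold every distinct ID in memory, which is exactly the cost the cardinality aggregation avoids.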
What It Enables

It lets you quickly find the number of unique items in massive data sets without slowing down your system.

Real Life Example

A store owner wants to know how many different customers visited last month to understand their reach, without manually checking millions of sales records.
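The store owner's question maps to a single request: a date filter for last month plus the cardinality aggregation. The index fields `visit_date` and `customer_id` here are assumptions about the store's mapping, and `"size": 0` skips returning individual documents since only the count matters:

```json
{
  "size": 0,
  "query": {
    "range": {
      "visit_date": { "gte": "now-1M/M", "lt": "now/M" }
    }
  },
  "aggs": {
    "unique_customers": {
      "cardinality": {
        "field": "customer_id.keyword"
      }
    }
  }
}
```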

Key Takeaways

Manual counting of unique items is slow and error-prone.

Cardinality aggregation efficiently estimates unique counts in big data.

This method saves time and memory, making analysis faster and easier.