Boolean and binary types in Elasticsearch - Time & Space Complexity
When working with Boolean and binary types in Elasticsearch, it's important to understand how the time to process queries grows as data size increases.
We want to know how the cost of searching or filtering on these types changes when we have more documents.
Analyze the time complexity of the following Elasticsearch query filtering on a Boolean field.
GET /my_index/_search
{
  "query": {
    "term": {
      "is_active": true
    }
  }
}
This query searches for documents where the Boolean field is_active is true.
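In this query, Elasticsearch does not read the is_active field of every document. Boolean fields are indexed, so it looks up the value true in the index and walks the list of documents that contain it.
- Primary operation: Visiting each entry in the list of documents where is_active is true.
- How many times: Once per matching document, which in the worst case is once per document in the index.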
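The counting above can be made concrete with a toy model. The sketch below is plain Python, not Elasticsearch code: it assumes a simplified inverted index that maps each Boolean value to a list of document IDs, and the names build_index and filter_term are hypothetical. It counts one operation per document ID visited while answering the filter.

```python
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: field value -> list of document IDs."""
    index = defaultdict(list)
    for doc_id, doc in enumerate(docs):
        index[doc["is_active"]].append(doc_id)
    return index


def filter_term(index, value):
    """Return matching document IDs and the number of list entries visited."""
    visited = 0
    hits = []
    for doc_id in index.get(value, []):  # walk only the list for `value`
        visited += 1
        hits.append(doc_id)
    return hits, visited


# Hypothetical sample data: every other document is active.
docs = [{"is_active": i % 2 == 0} for i in range(10)]
index = build_index(docs)
hits, visited = filter_term(index, True)
print(hits, visited)  # [0, 2, 4, 6, 8] 5
```

When every document is active, visited equals the number of documents, which is the worst case the analysis counts.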
As the number of documents grows, the list of matching documents grows with it. The table shows the worst case, where every document has is_active set to true.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of checks grows directly with the number of documents.
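Time Complexity: O(n) in the worst case
This means the time to filter documents grows in a straight line as the number of documents increases. More precisely, the cost is proportional to the number of matching documents, and with only two possible values that is usually a large share of n.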
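A quick way to see the straight-line growth is to count operations at several sizes. This sketch assumes a toy model with one operation per matching document; the helper count_checks and its active_fraction parameter are hypothetical. It is an illustration in Python, not a benchmark of Elasticsearch.

```python
def count_checks(n, active_fraction=1.0):
    """Count entries visited for is_active == True among n documents.

    Assumes a toy model in which the filter visits each matching
    document exactly once; active_fraction is a hypothetical parameter.
    """
    postings = [doc_id for doc_id in range(n) if doc_id < n * active_fraction]
    checks = 0
    for _ in postings:
        checks += 1
    return checks


for n in (10, 100, 1000):
    print(n, count_checks(n), count_checks(n, active_fraction=0.5))
# 10 10 5
# 100 100 50
# 1000 1000 500
```

Halving the share of active documents halves the count, but a tenfold increase in n still means a tenfold increase in work, which is what O(n) describes.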
[X] Wrong: "Filtering on Boolean fields is instant no matter how many documents there are."
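[OK] Correct: The inverted index lets Elasticsearch jump straight to the list of matching documents without examining the rest, but it still has to walk that list. Because a Boolean field has only two values, each list usually covers a large share of the index, so the work still grows with the number of documents.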
Understanding how simple filters like Boolean checks scale helps you explain how search engines handle large data efficiently and shows your grasp of query performance.
"What if we changed the Boolean filter to a binary field filter? How would the time complexity change?"