How Kibana Visualizes Elasticsearch Data - Performance Analysis
We want to understand how the time needed to visualize data in Kibana changes as the amount of Elasticsearch data grows.
How does the size of data affect the speed of creating visualizations?
Analyze the time complexity of this Elasticsearch aggregation query used by Kibana:
```json
GET /logs/_search
{
  "size": 0,
  "aggs": {
    "errors_over_time": {
      "date_histogram": {
        "field": "timestamp",
        "fixed_interval": "1h"
      },
      "aggs": {
        "error_count": { "terms": { "field": "error.keyword" } }
      }
    }
  }
}
```
This query groups log data by hour and counts error types for each hour.
Look at what repeats when Elasticsearch runs this query:
- Primary operation: Scanning all log entries in the time range to group them by hour.
- How many times: Once for each log entry, plus grouping and counting for each hour bucket.
As the number of logs grows, the work to group and count them grows too.
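The repeated work can be sketched in plain Python. This is a simplified stand-in for what the aggregation computes, not Elasticsearch's actual implementation: one pass over the entries, bucketing by hour and counting error types (the `timestamp` and `error` field names mirror the query above).

```python
from collections import defaultdict
from datetime import datetime

def errors_over_time(logs):
    """Simulate date_histogram + terms: one scan over all entries, O(n)."""
    buckets = defaultdict(lambda: defaultdict(int))
    for entry in logs:  # executes once per log entry -> work grows with n
        # Truncate the timestamp to the hour, like fixed_interval: "1h"
        hour = entry["timestamp"].replace(minute=0, second=0, microsecond=0)
        buckets[hour][entry["error"]] += 1  # terms sub-aggregation per bucket
    return buckets

logs = [
    {"timestamp": datetime(2024, 1, 1, 10, 5), "error": "timeout"},
    {"timestamp": datetime(2024, 1, 1, 10, 40), "error": "timeout"},
    {"timestamp": datetime(2024, 1, 1, 11, 15), "error": "500"},
]
result = errors_over_time(logs)
```

Doubling the length of `logs` doubles the number of loop iterations, which is exactly the linear pattern the table below illustrates.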
| Input Size (n) | Approx. Operations |
|---|---|
| 10 logs | About 10 scans and groupings |
| 100 logs | About 100 scans and groupings |
| 1000 logs | About 1000 scans and groupings |
Pattern observation: The work grows roughly in direct proportion to the number of logs.
Time Complexity: O(n)
This means the time to visualize data grows linearly with the number of log entries.
[X] Wrong: "Kibana visualization time stays the same no matter how much data there is."
[OK] Correct: More data means more entries to scan and group, so it takes more time.
Understanding how data size affects query time helps you explain performance in real projects and shows you can think about scaling data tools.
"What if we added a filter to only look at errors from the last day? How would the time complexity change?"