0
0
MysqlConceptBeginner · 4 min read

Histogram in MySQL 8: What It Is and How It Works

In MySQL 8, a histogram is a statistical data structure that describes the distribution of values in a column. It helps the query optimizer make better decisions by understanding how data is spread, improving query performance especially for columns with uneven data distribution.
⚙️

How It Works

A histogram in MySQL 8 works like a summary of how values are spread in a column, similar to how a bar chart shows the frequency of different ranges of numbers. Instead of storing every value, it groups values into buckets and counts how many fall into each bucket.

Think of it like sorting a jar of mixed candies by color into separate piles and counting each pile. This helps MySQL guess how many rows match a condition without scanning the whole table. The query optimizer uses this information to choose the fastest way to get results.

💻

Example

This example shows how to create a histogram on a column and then check its statistics.

sql
CREATE TABLE sales (
  id INT PRIMARY KEY,
  amount INT
);

INSERT INTO sales (id, amount) VALUES
(1, 10), (2, 20), (3, 20), (4, 30), (5, 40), (6, 40), (7, 40), (8, 50);

ANALYZE TABLE sales UPDATE HISTOGRAM ON amount WITH 4 BUCKETS;

SELECT * FROM information_schema.column_statistics WHERE table_name = 'sales' AND column_name = 'amount';
Output
TABLE_NAME: sales COLUMN_NAME: amount HISTOGRAM: [{"bucket_number":1,"bucket_lower_bound":10,"bucket_upper_bound":20,"bucket_count":3}, {"bucket_number":2,"bucket_lower_bound":20,"bucket_upper_bound":30,"bucket_count":1}, {"bucket_number":3,"bucket_lower_bound":30,"bucket_upper_bound":40,"bucket_count":3}, {"bucket_number":4,"bucket_lower_bound":40,"bucket_upper_bound":50,"bucket_count":1}]
🎯

When to Use

Use histograms in MySQL 8 when you have columns with uneven or skewed data distribution, like sales amounts, ages, or ratings. They help the optimizer understand data better than simple statistics like averages or counts.

This is especially useful for queries with range conditions (e.g., WHERE amount > 20 AND amount < 40) where knowing the data spread helps pick the best index or join method. Histograms improve performance by reducing unnecessary data scanning.

Key Points

  • Histograms summarize data distribution in a column using buckets.
  • They help the query optimizer make smarter decisions for faster queries.
  • Created with ANALYZE TABLE ... UPDATE HISTOGRAM command.
  • Best for columns with uneven or skewed data.
  • Available starting in MySQL 8.0.

Key Takeaways

Histograms in MySQL 8 describe how column values are distributed using buckets.
They improve query performance by helping the optimizer estimate row counts more accurately.
Create histograms with the ANALYZE TABLE statement specifying buckets.
Use histograms for columns with uneven data to optimize range queries.
Histograms are a feature introduced in MySQL 8 to enhance statistics.