Partitioning best practices in PostgreSQL - Time & Space Complexity
When using partitioning in PostgreSQL, it is important to understand how query time grows as data increases.
We want to know how partitioning affects the speed of data access and management.
Analyze the time complexity of querying a partitioned table by range.
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
sale_date DATE NOT NULL,
amount NUMERIC
) PARTITION BY RANGE (sale_date);
CREATE TABLE sales_2023 PARTITION OF sales
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
SELECT * FROM sales WHERE sale_date = '2023-06-15';
This code creates a sales table partitioned by date ranges and queries a specific date.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning the relevant partition(s) for matching rows.
- How many times: Only the partitions that match the query condition are scanned, not all partitions.
As the total data grows, the query only touches the relevant partition, so work grows with partition size, not total data size.
| Input Size (n) | Approx. Operations |
|---|---|
| 10,000 rows total | Scan ~1,000 rows in one partition |
| 100,000 rows total | Scan ~10,000 rows in one partition |
| 1,000,000 rows total | Scan ~100,000 rows in one partition |
Pattern observation: Query cost grows with the size of the accessed partition, not the whole table.
Time Complexity: O(k)
This means query time grows with the size of the relevant partition (k), not the entire dataset.
[X] Wrong: "Partitioning always makes queries run in constant time regardless of data size."
[OK] Correct: Partitioning limits the data scanned but query time still grows with the size of the accessed partition.
Understanding partitioning time complexity shows you can design databases that handle large data efficiently, a valuable skill in real projects.
"What if we changed from range partitioning to list partitioning on a low-cardinality column? How would the time complexity change?"