Partitioning best practices in PostgreSQL - Time & Space Complexity
Start learning this pattern below
Jump into concepts and practice - no test required
When using partitioning in PostgreSQL, it is important to understand how query time grows as data increases.
We want to know how partitioning affects the speed of data access and management.
Analyze the time complexity of querying a partitioned table by range.
CREATE TABLE sales (
id SERIAL PRIMARY KEY,
sale_date DATE NOT NULL,
amount NUMERIC
) PARTITION BY RANGE (sale_date);
CREATE TABLE sales_2023 PARTITION OF sales
FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');
SELECT * FROM sales WHERE sale_date = '2023-06-15';
This code creates a sales table partitioned by date ranges and queries a specific date.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning the relevant partition(s) for matching rows.
- How many times: Only the partitions that match the query condition are scanned, not all partitions.
As the total data grows, the query only touches the relevant partition, so work grows with partition size, not total data size.
| Input Size (n) | Approx. Operations |
|---|---|
| 10,000 rows total | Scan ~1,000 rows in one partition |
| 100,000 rows total | Scan ~10,000 rows in one partition |
| 1,000,000 rows total | Scan ~100,000 rows in one partition |
Pattern observation: Query cost grows with the size of the accessed partition, not the whole table.
Time Complexity: O(k)
This means query time grows with the size of the relevant partition (k), not the entire dataset.
[X] Wrong: "Partitioning always makes queries run in constant time regardless of data size."
[OK] Correct: Partitioning limits the data scanned but query time still grows with the size of the accessed partition.
Understanding partitioning time complexity shows you can design databases that handle large data efficiently, a valuable skill in real projects.
"What if we changed from range partitioning to list partitioning on a low-cardinality column? How would the time complexity change?"
Practice
Solution
Step 1: Understand what partitioning does
Partitioning divides a big table into smaller pieces called partitions.Step 2: Identify the benefit of smaller parts
Smaller parts make queries faster and data easier to manage.Final Answer:
It breaks a large table into smaller, manageable parts to improve performance. -> Option AQuick Check:
Partitioning = smaller parts for performance [OK]
- Thinking partitioning creates backups
- Confusing partitioning with encryption
- Believing partitioning merges tables
Solution
Step 1: Recall correct partition syntax
PostgreSQL uses PARTITION BY RANGE (column) after table columns.Step 2: Check each option
CREATE TABLE sales (id INT, sale_date DATE) PARTITION BY RANGE (sale_date); matches syntax: columns first, then PARTITION BY RANGE.Final Answer:
CREATE TABLE sales (id INT, sale_date DATE) PARTITION BY RANGE (sale_date); -> Option DQuick Check:
PARTITION BY RANGE after columns = correct syntax [OK]
- Placing PARTITION BY before columns
- Using PARTITION ON instead of PARTITION BY
- Confusing RANGE with LIST partition type
CREATE TABLE orders (id INT, region TEXT, order_date DATE) PARTITION BY LIST (region);
CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('US');
CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('EU');What will be the result of this query?
INSERT INTO orders VALUES (1, 'US', '2024-01-01'); SELECT * FROM orders WHERE region = 'US';
Solution
Step 1: Understand LIST partitioning by region
Rows with region 'US' go to orders_us partition.Step 2: Insert and select behavior
Insert puts row in orders_us; select filters region='US', so row is returned.Final Answer:
Returns the row with id 1, region 'US', and date '2024-01-01'. -> Option AQuick Check:
List partition returns matching rows [OK]
- Assuming insert fails without default partition
- Expecting syntax error on insert
- Thinking select ignores partition keys
CREATE TABLE logs (id SERIAL, log_date DATE) PARTITION BY RANGE (log_date);
CREATE TABLE logs_2023 PARTITION OF logs FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');Which error will occur if you run this?
INSERT INTO logs (log_date) VALUES ('2022-12-31');Solution
Step 1: Check partition ranges
logs_2023 covers dates from 2023-01-01 to 2024-01-01 only.Step 2: Insert date outside partition range
2022-12-31 is before 2023-01-01, so no matching partition exists.Final Answer:
ERROR: no partition found for row -> Option BQuick Check:
Insert outside range = no partition error [OK]
- Expecting automatic default partition
- Thinking syntax error occurs
- Assuming row goes to nearest partition
Solution
Step 1: Analyze query filters
Queries filter by user_id and created_at, so both should guide partitioning.Step 2: Choose partitioning methods
RANGE on created_at handles date ranges well; HASH subpartitioning on user_id balances data.Step 3: Evaluate other options
LIST on user_id alone is inefficient for many users; HASH on created_at is unusual; no partitioning misses benefits.Final Answer:
Use RANGE partitioning on created_at and subpartition by HASH on user_id. -> Option CQuick Check:
Combine RANGE and HASH for multi-column filtering [OK]
- Using LIST for high-cardinality user_id
- Hash partitioning on date column only
- Skipping partitioning on large tables
