What Is Hash Partitioning in PostgreSQL


Hash Partitioning in PostgreSQL: What It Is and How It Works

In PostgreSQL, hash partitioning divides a large table into smaller tables called partitions by applying a hash function to the value of a partition key column. Because the hash spreads values roughly evenly, the partitions stay similar in size, which makes each one easier to manage and lets a query that looks up a specific key value read only one partition.

⚙️

How It Works

Hash partitioning in PostgreSQL works by applying a hash function to the values of a chosen column, the partition key. Imagine you have a big box of colored balls and want to spread them across smaller boxes so that each box ends up about equally full. The hash function acts like a sorter that looks at a ball's color and decides which smaller box it goes into, always sending the same color to the same box.

When you insert a row, PostgreSQL hashes the partition key's value, divides the hash by the modulus, and places the row in the partition whose remainder matches. A good hash function spreads distinct values roughly evenly, so no single partition becomes overloaded. When a query specifies an exact value for the partition key, PostgreSQL applies the same calculation to find the one partition that can contain it and skips the others.
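The routing rule can be observed directly. The sketch below assumes a throwaway table named demo_hash (a hypothetical name, not part of this article's example) and calls satisfies_hash_partition, the function PostgreSQL places in the implicit constraint of every hash partition. It is not prominently documented, so treat this as an illustration rather than a supported interface.

```sql
-- Hypothetical demo table with two hash partitions.
CREATE TABLE demo_hash (id INT) PARTITION BY HASH (id);
CREATE TABLE demo_hash_p0 PARTITION OF demo_hash FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE demo_hash_p1 PARTITION OF demo_hash FOR VALUES WITH (MODULUS 2, REMAINDER 1);

-- For each id, ask whether its hash leaves remainder 0 or 1 when divided by 2.
-- Exactly one of the two columns is true for every row.
SELECT id,
       satisfies_hash_partition('demo_hash'::regclass, 2, 0, id) AS in_p0,
       satisfies_hash_partition('demo_hash'::regclass, 2, 1, id) AS in_p1
FROM generate_series(1, 5) AS id;
```

The column that comes back true names the partition an inserted row with that id would be stored in.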

💻

Example

This example shows how to create a hash partitioned table in PostgreSQL with 4 partitions based on the user_id column.

sql

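-- Parent table: stores no rows itself and routes each row to a partition.
CREATE TABLE users (
  user_id INT,
  username TEXT,
  email TEXT
) PARTITION BY HASH (user_id);

-- Four partitions: a row goes to the one whose REMAINDER equals
-- hash(user_id) modulo 4.
CREATE TABLE users_part_0 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE users_part_1 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE users_part_2 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE users_part_3 PARTITION OF users FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- Insert through the parent; PostgreSQL picks the partition.
INSERT INTO users (user_id, username, email) VALUES (1, 'alice', 'alice@example.com');
INSERT INTO users (user_id, username, email) VALUES (2, 'bob', 'bob@example.com');
INSERT INTO users (user_id, username, email) VALUES (3, 'carol', 'carol@example.com');
INSERT INTO users (user_id, username, email) VALUES (4, 'dave', 'dave@example.com');

-- tableoid::regclass shows which partition physically holds each row.
SELECT tableoid::regclass AS partition, * FROM users ORDER BY user_id;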

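Output

The query returns the four rows, each labeled with the partition (users_part_0 through users_part_3) that stores it. The assignment follows the hash of user_id, so consecutive IDs do not necessarily land in consecutive partitions.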

🎯

When to Use

Use hash partitioning when you want to evenly distribute data across partitions without relying on ranges or lists. It is especially useful when the partition key has many distinct values and you want to avoid hotspots where some partitions get much more data than others.
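One way to check the balance is to load some test rows and count them per partition. This is a minimal sketch that assumes the users table and its four partitions from the Example section already exist. The generated usernames, emails, and the ID range are made up for illustration.

```sql
-- Load 10,000 synthetic rows through the parent table.
INSERT INTO users (user_id, username, email)
SELECT g, 'user_' || g, 'user_' || g || '@example.com'
FROM generate_series(1001, 11000) AS g;

-- Count rows per partition; tableoid identifies the partition holding each row.
SELECT tableoid::regclass AS partition_name,
       count(*) AS row_count
FROM users
GROUP BY tableoid
ORDER BY partition_name;
```

With many distinct key values, the four counts should come out close to one another, though not exactly equal.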

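For example, if you have a large user table and want to split it into smaller parts for better performance and maintenance, hash partitioning on user_id can balance the data. It also helps when queries filter on the partition key with an equality condition, such as WHERE user_id = 42, because PostgreSQL can then skip the partitions that cannot contain that value. Range filters such as user_id > 100 do not benefit, since hashing scatters neighboring values across all partitions.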
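The effect on query plans can be inspected with EXPLAIN. The sketch below assumes the users table from the Example section and the default setting enable_partition_pruning = on. The literal values 42 and 100 are arbitrary.

```sql
-- Equality on the partition key: the planner can narrow the scan to one partition.
EXPLAIN (COSTS OFF)
SELECT * FROM users WHERE user_id = 42;

-- Range condition: matching rows may sit in any partition, so all are scanned.
EXPLAIN (COSTS OFF)
SELECT * FROM users WHERE user_id > 100;
```

The first plan should mention a single users_part_N table, while the second should list all four.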

✅

Key Points

Hash partitioning uses a hash function on a column to assign rows to partitions.
It distributes data roughly evenly, provided the key has many distinct values.
Each partition is defined by a modulus and a remainder, where the remainder is smaller than the modulus.
It improves query performance by pruning partitions when a query filters on an exact partition key value.
It is best for columns with many unique values and no natural range.
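To see the modulus and remainder assigned to each partition, you can read them back from the system catalogs. This sketch assumes the users table from the Example section. It relies on the pg_inherits and pg_class catalogs and the pg_get_expr function.

```sql
-- List every partition of users together with its bound definition.
SELECT c.relname AS partition_name,
       pg_get_expr(c.relpartbound, c.oid) AS bound
FROM pg_inherits AS i
JOIN pg_class AS c ON c.oid = i.inhrelid
WHERE i.inhparent = 'users'::regclass
ORDER BY c.relname;
```

Each row shows a partition name next to its FOR VALUES WITH clause, which is the same information psql prints for \d+ users.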

✅

Key Takeaways

Hash partitioning splits a table into balanced parts using a hash function on a column.

It helps improve performance by evenly distributing data and enabling partition pruning.

Use it when your partition key has many distinct values without natural ranges.

Partitions are created with modulus and remainder to define data placement.

Queries that filter on an exact partition key value benefit most from hash partitioning.