How to Use DISTINCT ON in PostgreSQL: Syntax and Examples
In PostgreSQL, use
DISTINCT ON (column_name) to return the first row for each unique value in the specified column(s). It requires an ORDER BY clause to determine which row to keep for each distinct group.Syntax
The DISTINCT ON clause selects the first row of each set of rows where the specified columns have the same values. It must be followed by an ORDER BY clause that defines the order of rows within each group.
Parts explained:
DISTINCT ON (column1, column2, ...): Columns to find unique groups.ORDER BY column1, column2, ...: Defines which row to pick first in each group.
sql
SELECT DISTINCT ON (column_name) column_name, other_columns FROM table_name ORDER BY column_name, other_columns;
Example
This example shows how to select the first order for each customer based on the earliest order date.
sql
CREATE TABLE orders ( id SERIAL PRIMARY KEY, customer_id INT, order_date DATE ); INSERT INTO orders (customer_id, order_date) VALUES (1, '2024-01-10'), (1, '2024-02-15'), (2, '2024-01-20'), (2, '2024-01-25'), (3, '2024-03-01'); SELECT DISTINCT ON (customer_id) customer_id, order_date FROM orders ORDER BY customer_id, order_date;
Output
customer_id | order_date
-------------+------------
1 | 2024-01-10
2 | 2024-01-20
3 | 2024-03-01
(3 rows)
Common Pitfalls
Common mistakes when using DISTINCT ON include:
- Not including an
ORDER BYclause, which causes an error. - Ordering by columns not in
DISTINCT ONcan lead to unexpected results. - Expecting
DISTINCT ONto behave likeDISTINCT(which removes all duplicates).
Correct usage requires ORDER BY starting with the same columns as DISTINCT ON.
sql
/* Wrong: Missing ORDER BY */ -- SELECT DISTINCT ON (customer_id) customer_id, order_date FROM orders; /* Wrong: ORDER BY columns do not start with DISTINCT ON columns */ -- SELECT DISTINCT ON (customer_id) customer_id, order_date FROM orders ORDER BY order_date; /* Right: ORDER BY starts with DISTINCT ON columns */ SELECT DISTINCT ON (customer_id) customer_id, order_date FROM orders ORDER BY customer_id, order_date;
Quick Reference
| Clause | Description |
|---|---|
| DISTINCT ON (columns) | Selects the first row for each unique group of columns |
| ORDER BY columns | Defines which row to pick first within each group; must start with DISTINCT ON columns |
| Usage | Use to get one row per group with control over which row is chosen |
| Difference from DISTINCT | DISTINCT removes all duplicates; DISTINCT ON keeps first row per group |
Key Takeaways
DISTINCT ON returns the first row for each unique value in specified columns.
Always use ORDER BY starting with the same columns as DISTINCT ON to control which row is returned.
DISTINCT ON is PostgreSQL-specific and differs from standard DISTINCT.
Without ORDER BY, DISTINCT ON will cause an error.
Use DISTINCT ON to simplify queries that need one row per group with custom ordering.