PostgreSQLquery~5 mins

DISTINCT ON for unique per group in PostgreSQL - Time & Space Complexity

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Time Complexity: DISTINCT ON for unique per group

O(n log n)

Understanding Time Complexity

We want to understand how the time needed to run a query using DISTINCT ON changes as the data grows.

Specifically, how does finding one unique row per group scale with more rows?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.


SELECT DISTINCT ON (category) category, item, price
FROM products
ORDER BY category, price DESC;

This query selects the most expensive item for each category from the products table.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

Primary operation: Scanning all rows in the products table to group by category.
How many times: Once over all rows, then sorting them by category and price.

How Execution Grows With Input

As the number of rows grows, the query must look at more data and sort more rows.

Input Size (n)	Approx. Operations
10	About 10 rows scanned and sorted
100	About 100 rows scanned and sorted
1000	About 1000 rows scanned and sorted

Pattern observation: The work grows roughly in proportion to the number of rows because sorting and scanning take more time as data grows.

Final Time Complexity

Time Complexity: O(n log n)

This means the time grows a bit faster than the number of rows because sorting takes extra steps as data grows.

Common Mistake

[X] Wrong: "DISTINCT ON just picks one row per group instantly, so it runs in constant time regardless of data size."

[OK] Correct: The database must first scan and sort the data to find the unique rows per group, so the time depends on how many rows there are.

Interview Connect

Understanding how DISTINCT ON works helps you explain how databases handle grouping and sorting efficiently, a useful skill for real-world data tasks.

Self-Check

What if we added an index on the category and price columns? How would the time complexity change?