SELECT with PostgreSQL-specific features - Time & Space Complexity
We want to understand how the time to run a SELECT query with PostgreSQL-specific features changes as the data grows.
Specifically, how does the query time grow when using features like window functions or DISTINCT ON?
Analyze the time complexity of the following PostgreSQL query.
SELECT DISTINCT ON (category)
category,
product_name,
price,
ROW_NUMBER() OVER (PARTITION BY category ORDER BY price DESC) AS rank
FROM products
ORDER BY category, price DESC;
This query selects the most expensive product per category using DISTINCT ON and a window function.
Look for repeated actions in the query execution.
- Primary operation: Scanning all rows in the products table.
- How many times: Once for the scan, then sorting and partitioning by category for the window function and DISTINCT ON.
As the number of products grows, the query must process more rows and sort them.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 rows scanned and sorted |
| 100 | About 100 rows scanned and sorted |
| 1000 | About 1000 rows scanned and sorted |
Pattern observation: The work grows roughly in proportion to the number of rows because sorting and partitioning happen on all rows.
Time Complexity: O(n log n)
This means the time grows a bit faster than the number of rows because sorting is involved.
[X] Wrong: "Using DISTINCT ON or window functions makes the query run in constant time regardless of data size."
[OK] Correct: These features still require scanning and sorting the data, so the time grows with the number of rows.
Understanding how PostgreSQL-specific features affect query time helps you explain your choices clearly and shows you know how databases handle data behind the scenes.
"What if we added an index on the category column? How would that change the time complexity?"