FIRST_VALUE and LAST_VALUE in PostgreSQL - Time & Space Complexity
We want to understand how the time to get the first or last value in a group grows as the data size increases.
How does the database handle these window functions when the input gets bigger?
Analyze the time complexity of the following code snippet.
SELECT
department,
employee_id,
salary,
FIRST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC) AS highest_salary,
LAST_VALUE(salary) OVER (PARTITION BY department ORDER BY salary DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS lowest_salary
FROM employees;
This query finds the highest and lowest salary per department using FIRST_VALUE and LAST_VALUE window functions.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning and ordering rows within each department group.
- How many times: Once per row, but ordering happens per group which depends on group size.
As the number of rows grows, the database must sort each department's rows to find first and last values.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 sorting steps within groups |
| 100 | Sorting grows, more steps per group |
| 1000 | Sorting cost increases roughly with n log n per group |
Pattern observation: Sorting cost grows faster than the number of rows because ordering is needed per group.
Time Complexity: O(n log n)
This means the time to compute first and last values grows a bit faster than the number of rows because sorting is involved.
[X] Wrong: "FIRST_VALUE and LAST_VALUE just pick values instantly without extra work."
[OK] Correct: The database must order rows within each group first, which takes extra time especially as data grows.
Understanding how window functions like FIRST_VALUE and LAST_VALUE scale helps you explain query performance clearly and shows you know how databases handle grouped data.
What if we removed the ORDER BY clause inside the window function? How would the time complexity change?