Array aggregation with ARRAY_AGG in PostgreSQL - Time & Space Complexity
When using ARRAY_AGG in PostgreSQL, we want to understand how the time to build the array grows as the number of rows increases.
We ask: How does the work change when we aggregate more rows into an array?
Analyze the time complexity of the following code snippet.
SELECT department_id, ARRAY_AGG(employee_name) AS employees
FROM employees
GROUP BY department_id;
This query groups employees by their department and collects all employee names into an array for each department.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Scanning each employee row once to add their name to the array.
- How many times: Once per employee row, so the number of times equals the total number of employees.
As the number of employees grows, the work to add each name to the array grows linearly.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 array insertions |
| 100 | About 100 array insertions |
| 1000 | About 1000 array insertions |
Pattern observation: The work grows directly with the number of rows aggregated.
Time Complexity: O(n)
This means the time to build the array grows in a straight line as more rows are added.
[X] Wrong: "ARRAY_AGG runs in constant time no matter how many rows there are."
[OK] Correct: Each row must be processed and added to the array, so more rows mean more work.
Understanding how aggregation functions scale helps you explain query performance clearly and shows you know how databases handle data growth.
"What if we used ARRAY_AGG with a WHERE clause filtering half the rows? How would the time complexity change?"