Correlated subqueries execution model in PostgreSQL - Time & Space Complexity
When using correlated subqueries, the database runs a small query repeatedly for each row of the main query.
We want to understand how the total work grows as the main table gets bigger.
Analyze the time complexity of the following code snippet.
SELECT e.employee_id, e.name
FROM employees e
WHERE e.salary > (
SELECT AVG(salary)
FROM employees
WHERE department_id = e.department_id
);
This query finds employees whose salary is above the average salary in their department using a correlated subquery.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each employee row, the subquery calculates the average salary for that employee's department.
- How many times: The subquery runs once per employee in the main query.
As the number of employees grows, the subquery runs more times, once per employee.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 subquery executions |
| 100 | About 100 subquery executions |
| 1000 | About 1000 subquery executions |
Pattern observation: The total work grows roughly in direct proportion to the number of employees.
Time Complexity: O(n * m)
This means the total work grows linearly with the number of rows in the main table times the average number of employees per department.
[X] Wrong: "The subquery runs only once, so the query is very fast regardless of table size."
[OK] Correct: Because the subquery depends on each row of the main query, it runs repeatedly, increasing total work as the table grows.
Understanding how correlated subqueries scale helps you write efficient queries and explain performance in real projects.
"What if we replaced the correlated subquery with a join and group by? How would the time complexity change?"