Subqueries in WHERE with IN in PostgreSQL - Time & Space Complexity
When using a subquery inside a WHERE clause with IN, the database checks if values match a list from another query.
We want to know how the time to run this query changes as the size of the data grows.
Analyze the time complexity of the following code snippet.
SELECT employee_id, name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'New York'
);
This query finds employees who work in departments located in New York.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each employee, check if their department_id is in the list returned by the subquery.
- How many times: This check happens once for every employee row.
As the number of employees grows, the database must check more rows against the subquery result.
| Input Size (n employees) | Approx. Operations |
|---|---|
| 10 | About 10 checks against the subquery list |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
Pattern observation: The number of operations grows roughly in direct proportion to the number of employees.
Time Complexity: O(n)
This means the time to run the query grows linearly with the number of employees.
[X] Wrong: "The subquery runs once and does not affect the total time much."
[OK] Correct: The subquery result is used for every employee check, so its size and how the database handles it impact the total time.
Understanding how subqueries affect query time helps you write efficient database queries and explain your reasoning clearly in interviews.
"What if we replaced IN with EXISTS? How would the time complexity change?"