Subquery with IN operator in SQL - Time & Space Complexity
When using a subquery with the IN operator, we want to know how the time to run the query changes as the data grows.
We ask: How does the database handle checking many values inside the IN list?
Analyze the time complexity of the following code snippet.
SELECT employee_name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'New York'
);
This query finds employees who work in departments located in New York by using a subquery inside the IN operator.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each employee, the database checks if their department_id is in the list returned by the subquery.
- How many times: This check happens once per employee row, so it repeats as many times as there are employees.
Explain the growth pattern intuitively.
| Input Size (n employees) | Approx. Operations |
|---|---|
| 10 | About 10 checks against the subquery list |
| 100 | About 100 checks against the subquery list |
| 1000 | About 1000 checks against the subquery list |
Pattern observation: The number of checks grows directly with the number of employees.
Time Complexity: O(n * m)
This means the time to run the query grows roughly in proportion to the number of employees times the number of departments in New York.
[X] Wrong: "The subquery runs once and does not affect the total time much."
[OK] Correct: The subquery result is checked for each employee, so its size and how the database handles it impact the total time.
Understanding how subqueries with IN work helps you explain query performance clearly and shows you can think about how databases handle data behind the scenes.
"What if we replaced the IN operator with an EXISTS clause? How would the time complexity change?"