Subquery with EXISTS operator in SQL - Time & Space Complexity
We want to understand how the time needed to run a query with an EXISTS subquery changes as the data grows.
Specifically, how does checking for existence inside a subquery affect performance?
Analyze the time complexity of the following code snippet.
SELECT employee_id, employee_name
FROM employees e
WHERE EXISTS (
SELECT 1
FROM sales s
WHERE s.employee_id = e.employee_id
);
This query finds all employees who have made at least one sale by checking if a matching record exists in the sales table.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each employee, the database checks the sales table to see if a matching sale exists.
- How many times: This check happens once per employee row.
As the number of employees grows, the database performs more existence checks.
| Input Size (n employees) | Approx. Operations |
|---|---|
| 10 | About 10 existence checks in sales |
| 100 | About 100 existence checks in sales |
| 1000 | About 1000 existence checks in sales |
Pattern observation: The number of checks grows roughly in direct proportion to the number of employees.
Time Complexity: O(n)
This means the time to run the query grows roughly in a straight line as the number of employees increases.
[X] Wrong: "The EXISTS subquery runs once and checks all sales at the same time."
[OK] Correct: Actually, the EXISTS check runs separately for each employee, so the work grows with the number of employees.
Understanding how subqueries with EXISTS scale helps you write efficient queries and explain your reasoning clearly in interviews.
"What if we replaced EXISTS with IN? How would the time complexity change?"