Why subqueries are needed in SQL - Performance Analysis
We want to understand how the time needed to run a query changes when it uses subqueries.
How does adding a subquery affect the work the database does?
Analyze the time complexity of the following code snippet.
SELECT employee_id, name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'New York'
);
This query finds employees who work in departments located in New York using a subquery.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: The database first runs the subquery to find matching departments.
- How many times: The subquery runs once, then the main query checks each employee against the subquery results.
Explain the growth pattern intuitively.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 employees, 3 departments | Runs subquery once (3 results), checks 10 employees |
| 100 employees, 10 departments | Runs subquery once (10 results), checks 100 employees |
| 1000 employees, 50 departments | Runs subquery once (50 results), checks 1000 employees |
Pattern observation: The subquery runs once, then the main query work grows with the number of employees.
Time Complexity: O(n + m)
This means the total work grows roughly with the size of both the subquery results and the main query rows.
[X] Wrong: "The subquery runs for every employee row."
[OK] Correct: The database usually runs the subquery once and reuses its results, so it does not repeat the subquery many times.
Understanding how subqueries affect query time helps you write clear and efficient database queries, a useful skill in many real projects.
"What if the subquery returned a very large list? How would that change the time complexity?"