Why subqueries nest queries in MySQL - Performance Analysis
When we use subqueries in MySQL, one query runs inside another. This nesting can affect how long the database takes to get results.
We want to understand how the time to run the whole query changes as the data grows.
Analyze the time complexity of the following code snippet.
SELECT employee_name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'New York'
);
This query finds employees who work in departments located in New York by using a subquery inside the WHERE clause.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: The database scans the departments table to find matching department IDs, then scans the employees table to find matching employees.
- How many times: The subquery runs once, and its result is used to filter the employees table, which is scanned once.
Explain the growth pattern intuitively.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 departments, 100 employees | Scan 10 departments + scan 100 employees |
| 100 departments, 1,000 employees | Scan 100 departments + scan 1,000 employees |
| 1,000 departments, 10,000 employees | Scan 1,000 departments + scan 10,000 employees |
Pattern observation: The total work grows roughly with the sum of the sizes of both tables scanned.
Time Complexity: O(m + n)
This means the time grows roughly in a straight line with the number of departments (m) plus the number of employees (n).
[X] Wrong: "The subquery runs once and does not affect the total time much."
[OK] Correct: Even though the subquery runs once, its result affects how many employees are checked, so both parts add up to the total time.
Understanding how nested queries affect performance helps you write better database queries and explain your thinking clearly in interviews.
What if the subquery returned a very large list of department IDs? How would that change the time complexity?