Self join for hierarchical data in SQL - Time & Space Complexity
When we use a self join to find relationships within the same table, it is important to understand how the work grows as the table gets bigger.
We want to know how the number of operations changes when the number of rows increases.
Analyze the time complexity of the following code snippet.
SELECT e1.employee_id, e1.name, e2.name AS manager_name
FROM employees e1
LEFT JOIN employees e2 ON e1.manager_id = e2.employee_id;
This query finds each employee and their manager by joining the employees table to itself.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: For each employee row, the database looks for a matching manager row in the same table.
- How many times: This matching happens once per employee, so it repeats as many times as there are employees.
As the number of employees grows, the database must check more rows to find matches.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 lookups for managers |
| 100 | About 100 lookups for managers |
| 1000 | About 1000 lookups for managers |
Pattern observation: The number of operations grows directly with the number of employees.
Time Complexity: O(n)
This means the work grows in a straight line as the number of rows increases.
[X] Wrong: "Because it joins the table to itself, the work is squared (n²)."
[OK] Correct: The join uses a key to find one matching row per employee, so it does not check every pair of rows.
Understanding how self joins scale helps you explain how databases handle hierarchical data efficiently, a useful skill in many real projects.
"What if the join condition was missing an index? How would the time complexity change?"