Recursive CTE for hierarchical data in PostgreSQL - Time & Space Complexity
When working with hierarchical data, recursive queries help us find all related items step-by-step.
We want to know how the time to get all related data grows as the hierarchy gets bigger.
Analyze the time complexity of the following recursive query.
WITH RECURSIVE subordinates AS (
SELECT employee_id, manager_id
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.manager_id
FROM employees e
JOIN subordinates s ON e.manager_id = s.employee_id
)
SELECT * FROM subordinates;
This query finds all employees under the top-level managers by repeatedly joining employees to their managers.
- Primary operation: Recursive join between employees and subordinates.
- How many times: Once per level of the hierarchy until all employees are found.
Each recursion step adds employees reporting to the previous level.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 joins to find all employees. |
| 100 | About 100 joins as it explores each employee once. |
| 1000 | About 1000 joins, one per employee in the hierarchy. |
Pattern observation: The work grows roughly in direct proportion to the number of employees.
Time Complexity: O(n)
This means the query time grows linearly with the number of employees in the hierarchy.
[X] Wrong: "Recursive queries always take exponential time because they repeat work."
[OK] Correct: The query visits each employee once, so work grows linearly, not exponentially.
Understanding recursive queries helps you handle real-world data like org charts or file systems efficiently.
What if the hierarchy had cycles? How would that affect the time complexity of the recursive query?