Recursive CTE for graph traversal in PostgreSQL - Time & Space Complexity
When using recursive queries to explore graphs, it's important to know how the work grows as the graph gets bigger.
We want to understand how the number of steps changes when the graph has more nodes and edges.
Analyze the time complexity of the following code snippet.
WITH RECURSIVE traverse AS (
SELECT id, parent_id
FROM nodes
WHERE id = 1
UNION ALL
SELECT n.id, n.parent_id
FROM nodes n
JOIN traverse t ON n.parent_id = t.id
)
SELECT * FROM traverse;
This query starts from node 1 and recursively finds all connected child nodes in the graph.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Recursive join to find child nodes for each node found.
- How many times: Once for each node reachable from the start node.
Each new node found triggers a join to find its children, so the work grows with the number of reachable nodes and edges.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 node checks and their edges |
| 100 | About 100 node checks and their edges |
| 1000 | About 1000 node checks and their edges |
Pattern observation: The operations grow roughly linearly with the number of nodes and edges visited.
Time Complexity: O(n + m)
This means the time grows in proportion to the number of nodes (n) plus the number of edges (m) explored in the graph.
[X] Wrong: "The recursive query runs in constant time regardless of graph size."
[OK] Correct: Each node and edge must be checked once, so the work grows as the graph grows.
Understanding recursive queries and their time complexity helps you explain how databases handle graph data efficiently, a useful skill in many real-world tasks.
"What if the graph contains cycles and the query does not prevent revisiting nodes? How would the time complexity change?"