0
0
PostgreSQLquery~10 mins

Join algorithms (nested loop, hash, merge) in PostgreSQL - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Join algorithms (nested loop, hash, merge)
Start Join Operation
Choose Join Algorithm
Nested Loop
Scan Outer
Scan Inner
Output Matches
End
The join operation starts by choosing one of three algorithms: nested loop, hash join, or merge join. Each algorithm follows its own steps to find matching rows and produce the joined result.
Execution Sample
PostgreSQL
SELECT * FROM A JOIN B ON A.id = B.id;
-- Using Nested Loop Join
-- Outer table: A
-- Inner table: B
-- For each row in A, scan B for matches
This query joins tables A and B on the id column using a nested loop join, scanning B for each row in A.
Execution Table
StepOuter Row (A.id)Inner Scan Position (B.id)Match Found?ActionOutput Rows
1A.id=1B.id=1YesOutput row (1,1)1
2A.id=1B.id=2NoContinue scanning B1
3A.id=2B.id=1NoContinue scanning B1
4A.id=2B.id=2YesOutput row (2,2)2
5A.id=3B.id=1NoContinue scanning B2
6A.id=3B.id=2NoContinue scanning B2
7A.id=3B.id=3YesOutput row (3,3)3
8---No more outer rows3
💡 All outer rows scanned; join complete with 3 matching output rows.
Variable Tracker
VariableStartAfter Step 1After Step 4After Step 7Final
Outer Row (A.id)None123End
Inner Scan Position (B.id)None1-21-21-3End
Output Rows Count01233
Key Moments - 3 Insights
Why does the inner scan restart for each outer row in nested loop join?
Because nested loop join scans the entire inner table for each outer row to find matches, as shown in execution_table rows 1-7 where inner scan position resets for each outer row.
How does hash join avoid scanning the inner table multiple times?
Hash join builds a hash table from the inner table once, then probes it for each outer row, avoiding repeated scans. This differs from nested loop join's repeated scanning.
Why must both tables be sorted in merge join?
Merge join requires both tables sorted on join keys to efficiently merge rows by advancing pointers, unlike nested loop or hash join.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, what is the Output Rows count after step 4?
A1
B2
C3
D0
💡 Hint
Check the Output Rows column at step 4 in the execution_table.
At which step does the nested loop join finish scanning all outer rows?
AStep 8
BStep 5
CStep 7
DStep 3
💡 Hint
Look for the step where the action says 'No more outer rows' in the execution_table.
If the inner table B had no matching rows, how would the Output Rows column change in the execution_table?
AIt would increase by one each step
BIt would show random values
CIt would remain zero throughout
DIt would be negative
💡 Hint
Refer to the Output Rows column and consider what happens when no matches are found.
Concept Snapshot
Join algorithms combine rows from two tables based on a condition.
Nested Loop: For each row in outer table, scan inner table.
Hash Join: Build hash on inner table, probe with outer rows.
Merge Join: Sort both tables, merge by advancing pointers.
Choose algorithm based on data size and indexes for best performance.
Full Transcript
This visual execution shows how join algorithms work in PostgreSQL. The join starts by choosing an algorithm: nested loop, hash join, or merge join. Nested loop join scans the inner table for each outer row, outputting matches. The execution table traces each step scanning rows from tables A and B, showing when matches are found and output rows increase. Variable tracker shows how outer row, inner scan position, and output count change. Key moments clarify why inner scans restart in nested loops, how hash join avoids repeated scans, and why merge join needs sorted tables. The quiz tests understanding of output counts and termination steps. The snapshot summarizes join algorithms and their behavior for quick reference.