Jump into concepts and practice - no test required
or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Recall & Review
beginner
What is a Nested Loop Join in SQL?
A Nested Loop Join compares each row of one table with each row of another table to find matching rows. It works like checking every item in one list against every item in another list.
Click to reveal answer
intermediate
How does a Hash Join work?
A Hash Join builds a fast lookup table (hash table) from one table's join column, then scans the other table to find matches quickly using the hash table.
Click to reveal answer
intermediate
What is the main requirement for a Merge Join to work efficiently?
Both tables must be sorted on the join columns. Merge Join then walks through both tables in order, matching rows like merging two sorted lists.
Click to reveal answer
advanced
When is Nested Loop Join preferred over Hash or Merge Join?
When one table is very small or when indexes exist on join columns, Nested Loop Join can be faster because it avoids building extra structures.
Click to reveal answer
intermediate
Why might a Hash Join be faster than a Nested Loop Join?
Because Hash Join uses a hash table to quickly find matching rows, it avoids checking every pair of rows, making it faster for large tables without indexes.
Click to reveal answer
Which join algorithm requires both tables to be sorted on the join keys?
ACross Join
BHash Join
CNested Loop Join
DMerge Join
✗ Incorrect
Merge Join works efficiently only if both tables are sorted on the join columns.
What is the main data structure used in a Hash Join?
AHash Table
BArray
CLinked List
DBinary Tree
✗ Incorrect
Hash Join builds a hash table from one table's join column to quickly find matching rows.
When is Nested Loop Join usually the best choice?
AWhen one table is small or indexed
BWhen both tables are large and unsorted
CWhen tables are sorted
DWhen no join condition exists
✗ Incorrect
Nested Loop Join is efficient when one table is small or when indexes exist on join columns.
Which join algorithm can be slowest for large tables without indexes?
AHash Join
BMerge Join
CNested Loop Join
DIndex Join
✗ Incorrect
Nested Loop Join checks every pair of rows, which can be very slow for large tables without indexes.
What is a key advantage of Merge Join over Hash Join?
AWorks without sorting
BUses less memory
CFaster on unsorted data
DBuilds hash tables
✗ Incorrect
Merge Join uses less memory because it processes sorted data sequentially without building hash tables.
Explain how Nested Loop Join, Hash Join, and Merge Join differ in their approach to joining tables.
Think about how each method finds matching rows.
You got /3 concepts.
Describe scenarios where each join algorithm (Nested Loop, Hash, Merge) is most efficient.
Consider table size and sorting.
You got /3 concepts.
Practice
(1/5)
1. Which join algorithm in PostgreSQL is best suited for small tables or when one table is very small compared to the other?
easy
A. Index Join
B. Hash Join
C. Nested Loop Join
D. Merge Join
Solution
Step 1: Understand Nested Loop Join usage
Nested Loop Join works by scanning one table and for each row scanning the other table. It is efficient when one table is small.
Step 2: Compare with other joins
Hash Join is better for large unsorted tables, Merge Join requires sorted inputs. Nested Loop is simplest and best for small tables.
Final Answer:
Nested Loop Join -> Option C
Quick Check:
Small table + Nested Loop Join = best [OK]
Hint: Small table joins usually use Nested Loop Join [OK]
Common Mistakes:
Confusing Hash Join as best for small tables
Thinking Merge Join works well without sorted data
Assuming Index Join is a separate join algorithm
2. Which of the following is the correct syntax to hint PostgreSQL to use a Hash Join in a query?
easy
A. SELECT /*+ HashJoin */ * FROM table1 JOIN table2 ON table1.id = table2.id;
B. SET enable_hashjoin = on; SELECT * FROM table1 JOIN table2 ON table1.id = table2.id;
C. SELECT * FROM table1 HASH JOIN table2 ON table1.id = table2.id;
D. SELECT * FROM table1 JOIN table2 USING HASH(id);
Solution
Step 1: Understand PostgreSQL join hints
PostgreSQL does not support inline join hints like /*+ HashJoin */ or HASH JOIN syntax.
Step 2: Use configuration to enable Hash Join
We can enable or disable join types using SET commands, e.g., SET enable_hashjoin = on; before the query.
Final Answer:
SET enable_hashjoin = on; SELECT ... -> Option B
Quick Check:
PostgreSQL uses SET to enable join types [OK]
Hint: Use SET enable_hashjoin to control hash join usage [OK]
Common Mistakes:
Using Oracle-style hints like /*+ HashJoin */
Trying to write HASH JOIN in SQL syntax
Using USING HASH() which is invalid
3. Given two tables employees(emp_id, dept_id) and departments(dept_id, name), what join algorithm will PostgreSQL most likely use for this query?
EXPLAIN SELECT * FROM employees JOIN departments ON employees.dept_id = departments.dept_id;
Assuming both tables are large and departments.dept_id is indexed.
medium
A. Nested Loop Join
B. Merge Join
C. Cross Join
D. Hash Join
Solution
Step 1: Analyze table sizes and indexes
Both tables are large, so Nested Loop is inefficient. Departments has an index on dept_id.
Step 2: Determine join algorithm choice
Hash Join is preferred for large tables without sorted data. Merge Join requires sorted inputs, which is not guaranteed here.
Final Answer:
Hash Join -> Option D
Quick Check:
Large tables + no sorted data = Hash Join [OK]
Hint: Large tables with join keys use Hash Join by default [OK]
Common Mistakes:
Assuming index forces Nested Loop Join
Thinking Merge Join is automatic without sorting
Confusing Cross Join with inner join
4. You wrote this query:
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.customer_id;
But PostgreSQL is using a Nested Loop Join causing slow performance. Which fix will most likely improve performance by enabling a better join algorithm?
medium
A. Disable Nested Loop Join with SET enable_nestloop = off;
B. Create an index on orders.customer_id
C. Rewrite query using LEFT JOIN instead of JOIN
D. Add ORDER BY on customer_id in the query
Solution
Step 1: Identify why Nested Loop is slow
Nested Loop is slow on large tables without indexes or when better joins exist but are not chosen.
Step 2: Force PostgreSQL to avoid Nested Loop
Disabling Nested Loop join with SET enable_nestloop = off forces PostgreSQL to pick Hash or Merge Join, improving performance.
Final Answer:
Disable Nested Loop Join with SET enable_nestloop = off; -> Option A
Quick Check:
Disable Nested Loop to force better join [OK]
Hint: Disable nested loop join to force hash or merge join [OK]
Common Mistakes:
Assuming adding index always fixes join choice
Changing JOIN type without understanding join algorithms
Adding ORDER BY does not affect join algorithm
5. You have two large sorted tables sales(date, product_id, amount) and products(product_id, name). You want to join them on product_id efficiently. Which join algorithm should you prefer and why?
hard
A. Merge Join, because it exploits sorted order for fast merging
B. Hash Join, because it hashes the smaller table for fast lookup
C. Nested Loop Join, because it works well with sorted data
D. Cross Join, because it combines all rows
Solution
Step 1: Identify join algorithm suited for sorted tables
Merge Join is designed to efficiently join two sorted inputs by merging them in order.
Step 2: Compare with other join algorithms
Nested Loop is inefficient for large tables, Hash Join ignores sorting, Cross Join produces Cartesian product.
Final Answer:
Merge Join, because it exploits sorted order for fast merging -> Option A
Quick Check:
Sorted tables + Merge Join = efficient join [OK]
Hint: Use Merge Join when both tables are sorted on join keys [OK]