Bird
Raised Fist0
DBMS Theoryknowledge~5 mins

Join algorithms (nested loop, sort-merge, hash join) in DBMS Theory - Cheat Sheet & Quick Revision

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Recall & Review
beginner
What is a Nested Loop Join in database systems?
A Nested Loop Join compares each row of one table with every row of another table to find matching pairs. It is simple but can be slow for large tables.
Click to reveal answer
intermediate
How does a Sort-Merge Join work?
Sort-Merge Join first sorts both tables on the join key, then merges them by scanning through both sorted lists to find matching rows efficiently.
Click to reveal answer
intermediate
What is the main idea behind a Hash Join?
Hash Join builds a hash table on the smaller table using the join key, then scans the larger table to find matches by looking up the hash table, making it fast for large data.
Click to reveal answer
beginner
Which join algorithm is generally best for small tables?
Nested Loop Join is often best for small tables because it is simple and does not require sorting or hashing.
Click to reveal answer
intermediate
Why is Sort-Merge Join efficient for already sorted data?
Because it can merge the two sorted tables in a single pass without extra sorting, reducing the time needed to find matching rows.
Click to reveal answer
Which join algorithm compares every row of one table with every row of another?
AIndex Join
BSort-Merge Join
CNested Loop Join
DHash Join
What is the first step in a Sort-Merge Join?
ASort both tables on the join key
BBuild a hash table
CScan the larger table
DCompare each row pair
Hash Join is most efficient when:
ABoth tables are very small
BOne table is much smaller than the other
CBoth tables are already sorted
DTables have no indexes
Which join algorithm requires sorting the tables?
ASort-Merge Join
BHash Join
CNested Loop Join
DCartesian Join
Which join algorithm is simplest but can be slow for large tables?
AHash Join
BSort-Merge Join
CIndex Join
DNested Loop Join
Explain how a Hash Join works and why it is efficient for large tables.
Think about dividing work between small and large tables using a hash.
You got /4 concepts.
    Compare Nested Loop Join and Sort-Merge Join in terms of when each is best used.
    Consider table size and sorting when choosing the join.
    You got /4 concepts.

      Practice

      (1/5)
      1. Which join algorithm compares each row of one table with every row of another table to find matching pairs?
      easy
      A. Index join
      B. Sort-merge join
      C. Hash join
      D. Nested loop join

      Solution

      1. Step 1: Understand the nested loop join process

        Nested loop join works by taking each row from the first table and comparing it with every row in the second table to find matches.
      2. Step 2: Compare with other join types

        Sort-merge join sorts and merges, hash join uses hashing, and index join uses indexes, so they do not compare every pair.
      3. Final Answer:

        Nested loop join -> Option D
      4. Quick Check:

        Every row compared = Nested loop join [OK]
      Hint: Nested loop = check all pairs one by one [OK]
      Common Mistakes:
      • Confusing nested loop with hash join
      • Thinking sort-merge compares all pairs
      • Assuming index join is same as nested loop
      2. Which of the following is the correct description of a hash join algorithm?
      easy
      A. Sort both tables and merge matching rows
      B. Use a hash table to find matching rows quickly
      C. Compare each row of one table with every row of another
      D. Use indexes to speed up join operations

      Solution

      1. Step 1: Recall hash join working

        Hash join builds a hash table on one table's join key to quickly find matching rows from the other table.
      2. Step 2: Eliminate other options

        Sorting and merging is sort-merge join, comparing all pairs is nested loop, and using indexes is index join, not hash join.
      3. Final Answer:

        Use a hash table to find matching rows quickly -> Option B
      4. Quick Check:

        Hash table = fast matching [OK]
      Hint: Hash join uses hash table for fast lookup [OK]
      Common Mistakes:
      • Mixing hash join with sort-merge join
      • Thinking hash join compares all pairs
      • Confusing index usage with hash join
      3. Consider two tables A and B, each with 1000 rows. Which join algorithm is likely to perform best if both tables are already sorted on the join key?
      medium
      A. Nested loop join
      B. Hash join
      C. Sort-merge join
      D. Cartesian join

      Solution

      1. Step 1: Analyze the condition of sorted tables

        Since both tables are sorted on the join key, sort-merge join can efficiently merge them by scanning once through both tables.
      2. Step 2: Compare with other algorithms

        Nested loop join checks all pairs (slow), hash join builds hash table (extra work), Cartesian join is unrelated and very expensive.
      3. Final Answer:

        Sort-merge join -> Option C
      4. Quick Check:

        Sorted tables = sort-merge join best [OK]
      Hint: Sorted tables? Use sort-merge join [OK]
      Common Mistakes:
      • Choosing nested loop for sorted tables
      • Assuming hash join is always fastest
      • Confusing Cartesian join with normal join
      4. A database query uses a hash join but runs very slowly. Which of the following is a likely cause?
      medium
      A. The hash table does not fit in memory causing disk spills
      B. The join keys are unique in both tables
      C. Both tables are already sorted
      D. The nested loop join was used instead

      Solution

      1. Step 1: Understand hash join memory use

        Hash join builds a hash table in memory. If it is too large, it spills to disk, slowing performance.
      2. Step 2: Evaluate other options

        Sorted tables do not slow hash join, unique keys help performance, and nested loop join is a different algorithm.
      3. Final Answer:

        The hash table does not fit in memory causing disk spills -> Option A
      4. Quick Check:

        Memory overflow = slow hash join [OK]
      Hint: Hash join slow? Check memory size for hash table [OK]
      Common Mistakes:
      • Blaming sorted tables for hash join slowness
      • Ignoring memory limits in hash join
      • Confusing nested loop join with hash join
      5. You have two large tables to join on a key. Table A fits in memory but Table B is very large and unsorted. Which join algorithm is best to minimize disk I/O and why?
      hard
      A. Hash join, because building a hash on smaller Table A allows fast matching
      B. Sort-merge join, because sorting both tables reduces I/O
      C. Cartesian join, because it joins all rows regardless of keys
      D. Nested loop join, because it checks all pairs without sorting

      Solution

      1. Step 1: Consider table sizes and memory

        Table A fits in memory, so building a hash table on it is efficient. Table B is large and unsorted, so sorting it would be expensive.
      2. Step 2: Choose join algorithm minimizing disk I/O

        Hash join uses the smaller table in memory to quickly find matches in the larger table, reducing disk reads compared to sorting or nested loops.
      3. Final Answer:

        Hash join, because building a hash on smaller Table A allows fast matching -> Option A
      4. Quick Check:

        Small table in memory = hash join best [OK]
      Hint: Small table fits memory? Use hash join [OK]
      Common Mistakes:
      • Choosing nested loop for large tables
      • Preferring sort-merge without considering sorting cost
      • Confusing Cartesian join with normal join