0
0
Pandasdata~20 mins

Left join behavior in Pandas - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Left Join Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
query_result
intermediate
2:00remaining
What is the output of this left join?

Given two pandas DataFrames df1 and df2:

df1 = pd.DataFrame({"key": [1, 2, 3], "val1": ["A", "B", "C"]})
df2 = pd.DataFrame({"key": [2, 3, 4], "val2": ["X", "Y", "Z"]})

result = df1.merge(df2, on="key", how="left")
print(result)

What will be printed?

Pandas
import pandas as pd
df1 = pd.DataFrame({"key": [1, 2, 3], "val1": ["A", "B", "C"]})
df2 = pd.DataFrame({"key": [2, 3, 4], "val2": ["X", "Y", "Z"]})

result = df1.merge(df2, on="key", how="left")
print(result)
A
   key val1 val2
0    1    A    Z
1    2    B    X
2    3    C    Y
B
   key val1 val2
0    1    A    X
1    2    B    Y
2    3    C    Z
C
   key val1 val2
0    1    A  NaN
1    2    B  NaN
2    3    C  NaN
D
   key val1 val2
0    1    A  NaN
1    2    B    X
2    3    C    Y
Attempts:
2 left
💡 Hint

Remember, a left join keeps all rows from the left DataFrame and matches rows from the right DataFrame where keys are equal. If no match, NaN appears.

🧠 Conceptual
intermediate
1:30remaining
What happens to unmatched rows in a left join?

When performing a left join between two tables, what happens to rows in the left table that have no matching key in the right table?

AThey appear in the result with NULL or NaN values for the right table's columns.
BThey are dropped from the result.
CThey cause an error and stop the join.
DThey are duplicated for every row in the right table.
Attempts:
2 left
💡 Hint

Think about what a left join means: all rows from the left table must appear.

📝 Syntax
advanced
2:00remaining
Which code correctly performs a left join on columns with different names?

You have two DataFrames:

df1 = pd.DataFrame({"id": [1, 2], "val1": ["A", "B"]})
df2 = pd.DataFrame({"key": [2, 3], "val2": ["X", "Y"]})

Which code correctly performs a left join where df1.id matches df2.key?

Adf1.merge(df2, on=["id", "key"], how="left")
Bdf1.merge(df2, left_on="id", right_on="key", how="left")
Cdf1.merge(df2, on="id", how="left")
Ddf1.merge(df2, left_index=True, right_index=True, how="left")
Attempts:
2 left
💡 Hint

When column names differ, you must specify which columns to join on from each DataFrame.

optimization
advanced
2:30remaining
How to optimize a left join on large DataFrames?

You have two very large DataFrames and want to perform a left join on a key column. Which approach can improve performance?

ASet the key columns as indexes before joining using <code>set_index</code>.
BConvert the DataFrames to lists and join manually with loops.
CUse <code>concat</code> instead of <code>merge</code>.
DDrop all columns except the key before joining.
Attempts:
2 left
💡 Hint

Indexes can speed up lookups during joins.

🔧 Debug
expert
3:00remaining
Why does this left join produce unexpected duplicated rows?

Given:

df1 = pd.DataFrame({"key": [1, 2, 2], "val1": ["A", "B", "C"]})
df2 = pd.DataFrame({"key": [2], "val2": ["X"]})

result = df1.merge(df2, on="key", how="left")
print(result)

The output is:

   key val1 val2
0    1    A  NaN
1    2    B    X
2    2    C    X

Why are there two rows with key=2 in the result?

ABecause the merge function was called with wrong parameters.
BBecause df2 has duplicate keys, causing multiple matches.
CBecause df1 has duplicate keys, the join repeats matching rows for each duplicate.
DBecause left join removes duplicates automatically.
Attempts:
2 left
💡 Hint

Think about how joins behave when the left table has repeated keys.