0
0
Pandasdata~20 mins

merge() for SQL-like joins in Pandas - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Master of merge() for SQL-like joins
Get all challenges correct to earn this badge!
Test your skills under time pressure!
Predict Output
intermediate
2:00remaining
Output of inner join with pandas.merge()

What is the output DataFrame after performing an inner join on the 'key' column?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'val1': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'val2': [4, 5, 6]})

result = pd.merge(df1, df2, on='key', how='inner')
print(result)
A{'key': ['A', 'B', 'C', 'D'], 'val1': [1, 2, 3, null], 'val2': [null, 4, 5, 6]}
B{'key': ['B', 'C', 'D'], 'val1': [2, 3, null], 'val2': [4, 5, 6]}
C{'key': ['B', 'C'], 'val1': [2, 3], 'val2': [4, 5]}
D{'key': ['A', 'B', 'C'], 'val1': [1, 2, 3], 'val2': [null, 4, 5]}
Attempts:
2 left
💡 Hint

Inner join keeps only keys present in both DataFrames.

data_output
intermediate
1:30remaining
Number of rows after left join with pandas.merge()

How many rows will the resulting DataFrame have after performing a left join on 'key'?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['X', 'Y', 'Z'], 'val1': [10, 20, 30]})
df2 = pd.DataFrame({'key': ['Y', 'Z'], 'val2': [100, 200]})

result = pd.merge(df1, df2, on='key', how='left')
print(len(result))
A3
B2
C4
D1
Attempts:
2 left
💡 Hint

Left join keeps all rows from the left DataFrame.

🔧 Debug
advanced
2:00remaining
Identify the error in merge() usage

What error will this code raise?

Pandas
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2], 'val': ['A', 'B']})
df2 = pd.DataFrame({'key': [1, 2], 'val2': ['X', 'Y']})

result = pd.merge(df1, df2, on='id')
AValueError: columns overlap but no suffix specified
BKeyError: 'id'
CTypeError: merge() missing 1 required positional argument
DNo error, merge runs successfully
Attempts:
2 left
💡 Hint

Check if both DataFrames have the column used in 'on'.

visualization
advanced
2:30remaining
Visualize the result of an outer join

Which option shows the correct DataFrame after an outer join on 'key'?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B'], 'val1': [1, 2]})
df2 = pd.DataFrame({'key': ['B', 'C'], 'val2': [3, 4]})

result = pd.merge(df1, df2, on='key', how='outer').sort_values('key').reset_index(drop=True)
print(result)
A
  key  val1  val2
0   A   1.0   NaN
1   B   2.0   3.0
2   C   NaN   4.0
B
  key  val1  val2
0   A   NaN   NaN
1   B   2.0   3.0
2   C   NaN   4.0
C
  key  val1  val2
0   A   1.0   3.0
1   B   2.0   4.0
2   C   NaN   NaN
D
  key  val1  val2
0   A   1.0   4.0
1   B   2.0   3.0
2   C   NaN   NaN
Attempts:
2 left
💡 Hint

Outer join keeps all keys from both DataFrames, filling missing values with NaN.

🚀 Application
expert
3:00remaining
Combine DataFrames with multiple keys and suffixes

You have two DataFrames with overlapping column names besides the keys. Which code correctly merges them on 'id' and 'date' with suffixes to avoid column name conflicts?

Pandas
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2], 'date': ['2023-01-01', '2023-01-02'], 'value': [10, 20]})
df2 = pd.DataFrame({'id': [1, 2], 'date': ['2023-01-01', '2023-01-02'], 'value': [100, 200]})
Apd.merge(df1, df2, on=['id', 'date'], suffixes=['_left', '_right'])
Bpd.merge(df1, df2, on=['id', 'date'], suffixes='_left_right')
Cpd.merge(df1, df2, on=['id', 'date'], suffixes=('left', 'right'))
Dpd.merge(df1, df2, on=['id', 'date'], suffixes=('_left', '_right'))
Attempts:
2 left
💡 Hint

Suffixes must be a tuple of strings.