Right join behavior in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Query Result
intermediate
What is the output of this right join?

Given two DataFrames df1 and df2, what is the result of df1.merge(df2, how='right', on='key')?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'val1': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'val2': [4, 5, 6]})

result = df1.merge(df2, how='right', on='key')
print(result)
A
  key  val1  val2
0   A   1.0   NaN
1   B   2.0     4
2   D   NaN     6
B
  key  val1  val2
0   A   1.0   NaN
1   B   2.0     4
2   C   3.0     5
C
  key  val1  val2
0   B   2.0     4
1   C   3.0     5
2   D   NaN     6
D
  key  val1  val2
0   B   2.0     4
1   C   3.0     5
💡 Hint

Remember, a right join keeps all rows from the right DataFrame and matches rows from the left.
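To make the hint concrete, here is a minimal sketch on a separate pair of DataFrames. The names `left`, `right`, `left_val`, and `right_val` are hypothetical and are not the DataFrames from the question. It uses the `indicator=True` option of `merge` to label where each output row came from.

```python
import pandas as pd

# Hypothetical example data, not the DataFrames from the question above.
left = pd.DataFrame({"key": ["x", "y"], "left_val": [10, 20]})
right = pd.DataFrame({"key": ["y", "z"], "right_val": [30, 40]})

# indicator=True adds a "_merge" column recording the origin of each row.
joined = left.merge(right, how="right", on="key", indicator=True)
print(joined)

# Every key from the right DataFrame survives the join.
print(joined["key"].tolist())

# Rows with no partner on the left are tagged "right_only"
# and carry NaN in the columns that came from the left.
unmatched = joined[joined["_merge"] == "right_only"]
print(unmatched["key"].tolist())
```

Keys that exist only in `left` do not appear in the result at all.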

🧠 Conceptual
intermediate
Which statement best describes a right join?

Choose the correct description of what a right join does in pandas merge.

A. Returns all rows from the left DataFrame and matching rows from the right DataFrame.
B. Returns all rows from the right DataFrame and matching rows from the left DataFrame.
C. Returns only rows that have matching keys in both DataFrames.
D. Returns all rows from both DataFrames, filling missing matches with NaN.
💡 Hint

Think about which DataFrame's rows are always kept in a right join.
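One way to explore the four descriptions is to run the same merge with each value of `how` and compare which keys survive. The sketch below assumes two small hypothetical DataFrames (`left` and `right`) chosen so that each join type yields a different key set.

```python
import pandas as pd

# Hypothetical example data chosen so each join type gives a different key set.
left = pd.DataFrame({"key": ["x", "y"], "left_val": [10, 20]})
right = pd.DataFrame({"key": ["y", "z"], "right_val": [30, 40]})

# Collect the keys that survive under each value of `how`.
keys_by_how = {
    how: left.merge(right, how=how, on="key")["key"].tolist()
    for how in ("left", "right", "inner", "outer")
}

for how, keys in keys_by_how.items():
    print(f"{how:>5}: {keys}")
```

Comparing the printed key lists against the `key` columns of `left` and `right` shows which input each join type preserves in full.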

📝 Syntax
advanced
Which code idiomatically performs a right join on columns 'id' and 'code'?

Given two DataFrames df1 and df2 that both contain columns named 'id' and 'code', which snippet is the most direct way to perform a right join on both columns?

Pandas
import pandas as pd

# df1 and df2 are predefined DataFrames
A. df1.merge(df2, how='right', left_on=['id', 'code'], right_on=['id', 'code'])
B. df1.merge(df2, how='right', left_on='id', right_on='code')
C. df1.merge(df2, how='right', on='id', on='code')
D. df1.merge(df2, how='right', on=['id', 'code'])
💡 Hint

Check the correct syntax for joining on multiple columns with the same names.
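As an illustration of joining on several same-named columns, the sketch below builds two small hypothetical DataFrames (the question only says df1 and df2 are predefined, so the contents here are assumptions) and passes a list of column names to `on`.

```python
import pandas as pd

# Hypothetical data: both frames share the column names 'id' and 'code'.
df1 = pd.DataFrame({"id": [1, 2], "code": ["a", "b"], "val1": [10, 20]})
df2 = pd.DataFrame({"id": [2, 3], "code": ["b", "c"], "val2": [30, 40]})

# A list passed to `on` joins on all listed columns at once.
result = df1.merge(df2, how="right", on=["id", "code"])
print(result)

# Each key column appears once in the output, followed by the value columns.
print(result.columns.tolist())
```

A row from df1 is matched only when every listed key column agrees with the row in df2. The `left_on` and `right_on` parameters are mainly useful when the key columns are named differently in the two DataFrames.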

Optimization
advanced
How do you optimize a right join when the right DataFrame is very large?

You have two DataFrames: df1 (small) and df2 (very large). You want to perform a right join. Which approach optimizes performance?

A. Perform a left join with df2 as left and df1 as right, then rename columns accordingly.
B. Perform a right join directly with df1.merge(df2, how='right') without changes.
C. Convert both DataFrames to dictionaries and merge manually in Python.
D. Sort both DataFrames by join keys before merging.
💡 Hint

Think about which DataFrame should be on the left for better performance.
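The relationship between a right join and a left join with the operands swapped can be checked directly. The sketch below uses small hypothetical DataFrames with unique keys (df2 merely stands in for the large one) and verifies that the two forms produce the same rows, differing only in column order. It makes no claim about speed; any performance difference depends on the pandas version and the data, so it is worth timing both forms on real inputs.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Hypothetical data: df1 is the small frame, df2 stands in for the large one.
df1 = pd.DataFrame({"key": ["B", "C"], "val1": [1, 2]})
df2 = pd.DataFrame({"key": ["A", "B", "C", "D"], "val2": [10, 20, 30, 40]})

right_join = df1.merge(df2, how="right", on="key")

# Same rows via a left join with the operands swapped.
swapped = df2.merge(df1, how="left", on="key")

# Only the column order differs, so reorder before comparing.
swapped = swapped[right_join.columns]

# Raises AssertionError if the two results differ.
assert_frame_equal(right_join, swapped)
print(right_join)
```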

🔧 Debug
expert
Why does this right join produce unexpected NaNs?

Consider this code:

import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B'], 'val1': [1, 2]})
df2 = pd.DataFrame({'key': ['a', 'b', 'c'], 'val2': [3, 4, 5]})
result = df1.merge(df2, how='right', on='key')

Why does result contain NaNs in val1 for all rows?

A. Because the keys have different cases ('A' vs 'a'), so no matches occur.
B. Because right join only keeps rows from the left DataFrame.
C. Because the 'on' parameter is missing in the merge call.
D. Because the DataFrames have different column names for the join key.
💡 Hint

Check if the join keys match exactly including case sensitivity.
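A quick way to investigate is to compare the merge before and after normalizing the keys. The sketch below reuses the two DataFrames from the question; lower-casing the left keys with `.str.lower()` is one possible normalization, shown here as an assumption about what the intended match should be (the name `df1_norm` is introduced only for this example).

```python
import pandas as pd

df1 = pd.DataFrame({"key": ["A", "B"], "val1": [1, 2]})
df2 = pd.DataFrame({"key": ["a", "b", "c"], "val2": [3, 4, 5]})

# String keys are compared exactly, so check whether any left value survived.
raw = df1.merge(df2, how="right", on="key")
print(raw["val1"].isna().all())

# One possible fix: lower-case the left keys on a copy before merging.
df1_norm = df1.assign(key=df1["key"].str.lower())
fixed = df1_norm.merge(df2, how="right", on="key")
print(fixed)
```

Using `assign` leaves the original df1 untouched, which helps when the original casing is still needed elsewhere.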