0
0
Pandasdata~20 mins

Inner join behavior in Pandas - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Inner Join Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
query_result
intermediate
2:00remaining
What is the output of this inner join?

Given two dataframes df1 and df2, what is the result of the inner join on column key?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['A', 'B', 'C'], 'val1': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['B', 'C', 'D'], 'val2': [4, 5, 6]})

result = pd.merge(df1, df2, on='key', how='inner')
print(result)
A[{'key': 'A', 'val1': 1, 'val2': 4}, {'key': 'B', 'val1': 2, 'val2': 5}]
B[{'key': 'B', 'val1': 2, 'val2': 4}, {'key': 'C', 'val1': 3, 'val2': 5}]
C[{'key': 'A', 'val1': 1, 'val2': 6}, {'key': 'D', 'val1': 3, 'val2': 6}]
D[{'key': 'A', 'val1': 1, 'val2': null}, {'key': 'B', 'val1': 2, 'val2': 4}, {'key': 'C', 'val1': 3, 'val2': 5}, {'key': 'D', 'val1': null, 'val2': 6}]
Attempts:
2 left
💡 Hint

Inner join returns only rows with matching keys in both dataframes.

🧠 Conceptual
intermediate
1:30remaining
Which statement best describes an inner join?

Choose the correct description of an inner join operation between two tables or dataframes.

AReturns only rows where keys exist in both tables, excluding all others.
BReturns all rows from the left table and matching rows from the right table, filling with nulls if no match.
CReturns all rows from the right table and matching rows from the left table, filling with nulls if no match.
DReturns all rows from both tables, combining matching rows and filling nulls where no match exists.
Attempts:
2 left
💡 Hint

Think about which rows appear in the result when keys do not match.

📝 Syntax
advanced
1:30remaining
Identify the correct syntax for an inner join in pandas

Which of the following code snippets correctly performs an inner join on column id between df_left and df_right?

Adf_left.merge(df_right, on='id', join='inner')
Bdf_left.join(df_right, on='id', how='inner')
Cpd.join(df_left, df_right, on='id', how='inner')
Dpd.merge(df_left, df_right, on='id', how='inner')
Attempts:
2 left
💡 Hint

Recall the correct function and parameter names for pandas merge.

optimization
advanced
2:00remaining
Optimizing inner join performance on large dataframes

You have two large dataframes with millions of rows each. Which approach can improve the performance of an inner join on column user_id?

AEnsure both dataframes have <code>user_id</code> as an index before merging.
BConvert <code>user_id</code> columns to string type before merging.
CDrop duplicates from both dataframes before merging regardless of keys.
DUse <code>how='outer'</code> instead of <code>how='inner'</code> to reduce computation.
Attempts:
2 left
💡 Hint

Indexes can speed up join operations.

🔧 Debug
expert
2:30remaining
Why does this inner join produce an empty dataframe?

Given the code below, why is the result of the inner join empty?

Pandas
import pandas as pd

df1 = pd.DataFrame({'key': ['a', 'b', 'c'], 'val': [1, 2, 3]})
df2 = pd.DataFrame({'key': ['A', 'B', 'C'], 'val': [4, 5, 6]})

result = pd.merge(df1, df2, on='key', how='inner')
print(result)
AThe merge function requires specifying suffixes for overlapping columns.
BThe 'val' columns have different names causing the join to fail.
CThe keys differ in case sensitivity; 'a' != 'A', so no matches occur.
DThe 'how' parameter should be 'left' instead of 'inner' to get results.
Attempts:
2 left
💡 Hint

Check if the keys match exactly including letter case.