
Merging on multiple keys in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of merging two DataFrames on multiple keys
What is the output DataFrame after merging df1 and df2 on columns 'key1' and 'key2' using an inner join?
Data Analysis Python
import pandas as pd

df1 = pd.DataFrame({
    'key1': ['A', 'B', 'C', 'A'],
    'key2': [1, 2, 3, 2],
    'value1': [10, 20, 30, 40]
})

df2 = pd.DataFrame({
    'key1': ['A', 'B', 'A', 'D'],
    'key2': [1, 2, 2, 4],
    'value2': [100, 200, 300, 400]
})

result = pd.merge(df1, df2, on=['key1', 'key2'], how='inner')
print(result)
A
  key1  key2  value1  value2
0    A     1      10     100
1    B     2      20     200
2    D     4      40     400
B
  key1  key2  value1  value2
0    A     1      10     100
1    B     2      20     200
2    C     3      30     300
C
  key1  key2  value1  value2
0    A     1      10     100
1    B     2      20     200
2    A     2      40     300
D
  key1  key2  value1  value2
0    A     1      10     100
1    B     2      20     200
2    A     2      40     400
💡 Hint
Look for rows where both 'key1' and 'key2' match in both DataFrames.
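To see the matching rule in isolation, here is a minimal sketch on hypothetical data (the `orders` and `targets` frames and their column names are invented for illustration and are unrelated to the challenge). A row survives an inner join on two keys only when both key values agree.

```python
import pandas as pd

# Hypothetical frames, not the ones from the challenge above.
orders = pd.DataFrame({
    'region': ['east', 'east', 'west'],
    'year': [2022, 2023, 2023],
    'sales': [5, 7, 9],
})
targets = pd.DataFrame({
    'region': ['east', 'west', 'west'],
    'year': [2023, 2022, 2023],
    'target': [8, 4, 10],
})

# ('east', 2022) has no partner in targets and ('west', 2022) has none
# in orders, so only the two fully matching pairs remain.
matched = pd.merge(orders, targets, on=['region', 'year'], how='inner')
print(matched)
```

Matching on `region` alone would pair more rows; listing both columns in `on` is what restricts the result to exact (region, year) pairs.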
Data Output
intermediate
Number of rows after merging on multiple keys with outer join
After merging df1 and df2 on ['key1', 'key2'] using an outer join, how many rows does the resulting DataFrame have?
Data Analysis Python
import pandas as pd

df1 = pd.DataFrame({
    'key1': ['X', 'Y', 'Z'],
    'key2': [1, 2, 3],
    'val1': [5, 10, 15]
})

df2 = pd.DataFrame({
    'key1': ['X', 'Y', 'W'],
    'key2': [1, 4, 3],
    'val2': [50, 40, 30]
})

merged = pd.merge(df1, df2, on=['key1', 'key2'], how='outer')
print(len(merged))
A. 3
B. 5
C. 4
D. 6
💡 Hint
Count all unique pairs of keys from both DataFrames combined.
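One way to check this kind of count, sketched on hypothetical frames (`left`, `right`, and their columns are invented for illustration), is to pass `indicator=True` so each output row is labelled with where its key pair came from. The shortcut of counting distinct key pairs holds only when no key pair repeats within either frame.

```python
import pandas as pd

# Hypothetical frames, not the ones from the challenge above.
left = pd.DataFrame({'k1': ['a', 'b'], 'k2': [1, 2], 'x': [1, 2]})
right = pd.DataFrame({'k1': ['a', 'b'], 'k2': [1, 3], 'y': [10, 30]})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'.
merged = pd.merge(left, right, on=['k1', 'k2'], how='outer', indicator=True)
print(merged)

# Distinct (k1, k2) pairs across both frames. This equals the outer-join
# row count here because neither frame repeats a key pair.
pairs = pd.concat([left[['k1', 'k2']], right[['k1', 'k2']]]).drop_duplicates()
print(len(pairs), len(merged))
```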
🔧 Debug
advanced
Identify the error in merging on multiple keys
What error will this code raise when trying to merge df1 and df2 on ['key1', 'key3']?
Data Analysis Python
import pandas as pd

df1 = pd.DataFrame({
    'key1': ['A', 'B'],
    'key2': [1, 2],
    'value': [100, 200]
})

df2 = pd.DataFrame({
    'key1': ['A', 'B'],
    'key3': [1, 2],
    'value': [300, 400]
})

result = pd.merge(df1, df2, on=['key1', 'key3'])
A. KeyError: 'key3'
B. TypeError: merge() got an unexpected keyword argument 'on'
C. ValueError: columns overlap but no suffix specified
D. No error, merge runs successfully
💡 Hint
Check if both DataFrames have all columns specified in 'on'.
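The check described in the hint can be done before calling `merge`. The sketch below uses hypothetical frames (`left`, `right`, and the column `code` are invented for illustration) and also shows `left_on`/`right_on`, which is one option when the key columns are named differently on each side.

```python
import pandas as pd

# Hypothetical frames, not the ones from the challenge above.
left = pd.DataFrame({'k1': ['a', 'b'], 'k2': [1, 2], 'x': [1, 2]})
right = pd.DataFrame({'k1': ['a', 'b'], 'code': [1, 2], 'y': [10, 20]})

keys = ['k1', 'k2']

# Every name passed to `on` must exist in both frames.
missing_left = [k for k in keys if k not in left.columns]
missing_right = [k for k in keys if k not in right.columns]
print(missing_left, missing_right)

# When the key columns carry different names, pair them up explicitly.
merged = pd.merge(left, right, left_on=['k1', 'k2'], right_on=['k1', 'code'])
print(merged)
```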
🚀 Application
advanced
Result of merging with suffixes on overlapping columns
What will be the output DataFrame after merging df1 and df2 on ['id', 'date'] with suffixes ('_left', '_right')?
Data Analysis Python
import pandas as pd

df1 = pd.DataFrame({
    'id': [1, 2],
    'date': ['2023-01-01', '2023-01-02'],
    'value': [10, 20]
})

df2 = pd.DataFrame({
    'id': [1, 2],
    'date': ['2023-01-01', '2023-01-02'],
    'value': [100, 200]
})

merged = pd.merge(df1, df2, on=['id', 'date'], suffixes=('_left', '_right'))
print(merged)
A
   id        date  value_right  value
0   1  2023-01-01         100     10
1   2  2023-01-02         200     20
B
   id        date  value  value
0   1  2023-01-01     10    100
1   2  2023-01-02     20    200
C
   id        date  value_left  value
0   1  2023-01-01          10    100
1   2  2023-01-02          20    200
D
   id        date  value_left  value_right
0   1  2023-01-01          10          100
1   2  2023-01-02          20          200
💡 Hint
Suffixes are added to overlapping column names except the keys.
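A small sketch of the same rule on hypothetical data (the frames, the `score` column, and the suffix strings are invented for illustration): key columns appear once and keep their names, while a non-key column present on both sides is renamed with the left and right suffix.

```python
import pandas as pd

# Hypothetical frames, not the ones from the challenge above.
left = pd.DataFrame({'user': [1, 2], 'day': ['mon', 'tue'], 'score': [3, 4]})
right = pd.DataFrame({'user': [1, 2], 'day': ['mon', 'tue'], 'score': [30, 40]})

# 'score' exists on both sides and is not a key, so it gets suffixed.
merged = pd.merge(left, right, on=['user', 'day'], suffixes=('_old', '_new'))
print(list(merged.columns))
```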
🧠 Conceptual
expert
Understanding merge behavior with duplicate keys
Given df1 and df2 below, how many rows will the merged DataFrame have after merging on ['key1', 'key2'] with an inner join?
Data Analysis Python
import pandas as pd

df1 = pd.DataFrame({
    'key1': ['A', 'A'],
    'key2': [1, 1],
    'val1': [10, 20]
})

df2 = pd.DataFrame({
    'key1': ['A', 'A'],
    'key2': [1, 1],
    'val2': [100, 200]
})

merged = pd.merge(df1, df2, on=['key1', 'key2'], how='inner')
print(len(merged))
A. 4
B. 3
C. 1
D. 2
💡 Hint
Think about how many combinations are formed when keys are duplicated in both DataFrames.
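The combination count can be worked out per key pair as (occurrences on the left) times (occurrences on the right). The sketch below checks that on hypothetical frames with different sizes from the challenge, and shows the `validate` argument of `pd.merge` as one way to make unexpected duplicates fail loudly.

```python
import pandas as pd

# Hypothetical frames: the same key pair appears 3 times on the left
# and 2 times on the right.
left = pd.DataFrame({'k1': ['a'] * 3, 'k2': [1] * 3, 'x': [1, 2, 3]})
right = pd.DataFrame({'k1': ['a'] * 2, 'k2': [1] * 2, 'y': [10, 20]})

merged = pd.merge(left, right, on=['k1', 'k2'], how='inner')

# Per-key occurrence counts; their product summed over shared keys
# gives the inner-join row count.
left_counts = left.groupby(['k1', 'k2']).size()
right_counts = right.groupby(['k1', 'k2']).size()
expected = int((left_counts * right_counts).dropna().sum())
print(len(merged), expected)

# validate='one_to_one' asks pandas to check key uniqueness on both sides.
raised = False
try:
    pd.merge(left, right, on=['k1', 'k2'], validate='one_to_one')
except pd.errors.MergeError as exc:
    raised = True
    print(exc)
```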