Split-apply-combine mental model in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of groupby and aggregation
What is the output of this code snippet using pandas split-apply-combine?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 5, 20]
})
result = data.groupby('Team')['Points'].sum()
print(result)
A
Team
A    25
B    15
C    20
Name: Points, dtype: int64
B
Team
A    10
B    10
C    20
Name: Points, dtype: int64
C
Team
A    15
B    5
C    20
Name: Points, dtype: int64
D
Team
A    25
B    15
C    20
Name: Team, dtype: int64
💡 Hint
Think about what sum() does after grouping by 'Team'.
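As a rough illustration of the hint, the sketch below spells out the three steps by hand and compares them with the one-line groupby form. The DataFrame, the column names 'City' and 'Visits', and the variable names are made up for this example and are not the quiz data.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz DataFrame
df = pd.DataFrame({
    'City': ['Oslo', 'Oslo', 'Rome'],
    'Visits': [3, 4, 5],
})

# Split: iterate over one sub-DataFrame per City
# Apply: sum the 'Visits' column of each piece
# Combine: collect the per-group sums into a single Series
manual = pd.Series(
    {city: group['Visits'].sum() for city, group in df.groupby('City')},
    name='Visits',
)
manual.index.name = 'City'

# The usual one-line form performs the same three steps internally
shortcut = df.groupby('City')['Visits'].sum()

print(manual.to_dict())    # {'Oslo': 7, 'Rome': 5}
print(shortcut.to_dict())  # {'Oslo': 7, 'Rome': 5}
```

Note that the resulting Series is named after the aggregated column, while the grouping column supplies the index name.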
Data Output
intermediate
Resulting DataFrame after applying mean aggregation
Given the DataFrame below, what is the resulting DataFrame after applying groupby and mean aggregation on 'Category'?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Category': ['X', 'X', 'Y', 'Y', 'Z'],
    'Value': [4, 6, 8, 2, 10]
})
result = data.groupby('Category', as_index=False).mean()
print(result)
A
  Category  Value
0        X    10
1        Y    10
2        Z    10
B
  Category  Value
0        X    5.0
1        Y    5.0
2        Z   10.0
C
Category  Value
X         5.0
Y         5.0
Z        10.0
D
Category  Value
X         4
Y         8
Z        10
💡 Hint
Remember that mean averages the values per group.
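Two details tend to matter when reading the output of a grouped mean: where the grouping key ends up, and the dtype of the result. The following sketch uses a made-up DataFrame (columns 'Kind' and 'Score' are assumptions for illustration) to compare the default behaviour with as_index=False.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz DataFrame
df = pd.DataFrame({
    'Kind': ['p', 'p', 'q'],
    'Score': [1, 2, 6],
})

# Default: the grouping key becomes the index of the result
indexed = df.groupby('Kind').mean()

# as_index=False: the grouping key stays a regular column
# and the rows get a fresh 0..n-1 index
flat = df.groupby('Kind', as_index=False).mean()

print(indexed.index.name)     # Kind
print(list(flat.columns))     # ['Kind', 'Score']
print(flat['Score'].tolist()) # [1.5, 6.0]
print(flat['Score'].dtype)    # float64, even though the inputs were integers
```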
🔧 Debug
advanced
Identify the error in groupby aggregation code
What error does this code raise when run?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2'],
    'Score': [10, 20, 30]
})
result = data.groupby('Group').agg({'Score': 'sum', 'Age': 'mean'})
print(result)
A
No error, prints sum and mean
B
TypeError: unsupported operand type(s) for +: 'int' and 'str'
C
KeyError (the column 'Age' does not exist)
D
ValueError: No numeric types to aggregate
💡 Hint
Check if all columns in agg dictionary exist in DataFrame.
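One way to act on this hint is to compare the keys of the aggregation dictionary with the DataFrame's columns before calling agg. This is a minimal sketch under assumed names: the data, the 'Bonus' column, and the variables spec, missing and valid_spec are all hypothetical.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz DataFrame
df = pd.DataFrame({
    'Dept': ['d1', 'd1', 'd2'],
    'Hours': [5, 7, 9],
})

# Hypothetical aggregation spec that names a column ('Bonus') the frame lacks
spec = {'Hours': 'sum', 'Bonus': 'mean'}

# Find the keys that are not real columns, and keep only the valid ones
missing = [col for col in spec if col not in df.columns]
valid_spec = {col: func for col, func in spec.items() if col in df.columns}

print(missing)  # ['Bonus']

# Aggregating with the filtered spec works because every key now exists
result = df.groupby('Dept').agg(valid_spec)
print(result['Hours'].to_dict())  # {'d1': 12, 'd2': 9}
```

Whether to drop the missing keys silently, as here, or to stop and report them is a design choice; in most pipelines reporting is the safer default.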
🚀 Application
advanced
Calculate median per group and filter groups
You have a DataFrame with sales data. Which code selects the 'Sales' column first, calculates its median per 'Store', and returns only the stores with a median above 100?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Store': ['S1', 'S1', 'S2', 'S2', 'S3'],
    'Sales': [120, 80, 150, 200, 90]
})
A
result = data.groupby('Store')['Sales'].median()
filtered = result[result > 100]
print(filtered)
B
result = data.groupby('Store').median()['Sales']
filtered = result[result > 100]
print(filtered)
C
result = data.groupby('Store')['Sales'].mean()
filtered = result[result > 100]
print(filtered)
D
result = data.groupby('Store')['Sales'].sum()
filtered = result[result > 100]
print(filtered)
💡 Hint
Median is the middle value, not the average or sum.
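To see how the three statistics can diverge, the sketch below computes median, mean and sum side by side on a made-up DataFrame (the 'Shop' and 'Units' columns and the threshold of 3 are assumptions for illustration), then filters the aggregated Series with a boolean mask.

```python
import pandas as pd

# Hypothetical data, unrelated to the quiz DataFrame
df = pd.DataFrame({
    'Shop': ['a', 'a', 'a', 'b', 'b'],
    'Units': [1, 2, 9, 4, 6],
})

# Several aggregations at once: one output column per function name
stats = df.groupby('Shop')['Units'].agg(['median', 'mean', 'sum'])
print(stats)
# Shop 'a': median 2, mean 4, sum 12 (the outlier 9 pulls the mean up)
# Shop 'b': median 5, mean 5, sum 10

# A boolean mask on the aggregated Series keeps only groups above a threshold
medians = stats['median']
above = medians[medians > 3]
print(above.index.tolist())  # ['b']
```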
🧠 Conceptual
expert
Understanding split-apply-combine with custom aggregation
Which option best describes what happens in the split-apply-combine process when using a custom aggregation function in pandas groupby?
A
Data is split into groups, combined immediately, and then the custom function is applied to the combined data.
B
Data is combined first, then the custom function is applied to the entire dataset, and finally split into groups.
C
The custom function is applied to the entire dataset without splitting, then results are grouped.
D
Data is split into groups, the custom function is applied to each group independently, then the results are combined into a new DataFrame or Series.
💡 Hint
Think about the order: split, apply, then combine.
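A custom aggregation can be passed to agg as an ordinary Python function that receives the values of one group and returns a single value. The sketch below is only an illustration: value_range, the 'Lab' and 'Reading' columns, and the data are hypothetical.

```python
import pandas as pd


def value_range(values: pd.Series) -> int:
    """Hypothetical custom aggregation: max minus min of one group's values."""
    return values.max() - values.min()


# Hypothetical data, unrelated to the quiz
df = pd.DataFrame({
    'Lab': ['m', 'm', 'n', 'n'],
    'Reading': [3, 8, 10, 14],
})

# The 'Reading' values of each Lab are handed to value_range separately,
# and the per-group results are collected into one Series indexed by Lab
ranges = df.groupby('Lab')['Reading'].agg(value_range)
print(ranges.to_dict())  # {'m': 5, 'n': 4}
```

Because the function only ever sees one group at a time, a value in Lab 'n' cannot affect the result for Lab 'm'.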