Challenge - 5 Problems
Split-apply-combine Master
❓ Predict Output
intermediate, 2:00 time limit
Output of groupby and aggregation
What is the output of this code snippet using pandas split-apply-combine?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 5, 20]
})
result = data.groupby('Team')['Points'].sum()
print(result)
💡 Hint
Think about what sum() does after grouping by 'Team'.
Explanation
The code groups data by 'Team' and sums the 'Points' for each group. Team A has 10+15=25, B has 10+5=15, and C has 20.
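As a quick check of the arithmetic above, here is a minimal sketch that rebuilds the same DataFrame and compares the grouped sums with the hand-computed totals. The name `expected` is introduced only for this illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 5, 20]
})

# Split by 'Team', apply sum to 'Points', combine into a Series indexed by team
result = data.groupby('Team')['Points'].sum()

# Hand-computed totals from the explanation (illustrative name, not from the quiz)
expected = {'A': 25, 'B': 15, 'C': 20}
print(result.to_dict() == expected)
```

Note that the result is a Series whose index holds the team labels, not a DataFrame.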
❓ Data Output
intermediate, 2:00 time limit
Resulting DataFrame after applying mean aggregation
Given the DataFrame below, what is the resulting DataFrame after applying groupby and mean aggregation on 'Category'?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Category': ['X', 'X', 'Y', 'Y', 'Z'],
    'Value': [4, 6, 8, 2, 10]
})
result = data.groupby('Category', as_index=False).mean()
print(result)
💡 Hint
Remember that mean averages the values per group.
Explanation
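Grouping by 'Category' and taking the mean averages the 'Value' column within each category. X: (4+6)/2 = 5, Y: (8+2)/2 = 5, Z: 10/1 = 10. Because as_index=False is set, 'Category' stays a regular column in the resulting DataFrame instead of becoming the index.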
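The effect of `as_index=False` is easiest to see side by side with the default. The sketch below uses the same data. The names `flat` and `indexed` are illustrative and do not come from the quiz.

```python
import pandas as pd

data = pd.DataFrame({
    'Category': ['X', 'X', 'Y', 'Y', 'Z'],
    'Value': [4, 6, 8, 2, 10]
})

# as_index=False keeps 'Category' as a regular column of the result
flat = data.groupby('Category', as_index=False).mean()

# Default behaviour: 'Category' becomes the index of the result
indexed = data.groupby('Category').mean()

print(list(flat.columns))      # ['Category', 'Value']
print(list(indexed.columns))   # ['Value']
print(flat['Value'].tolist())  # [5.0, 5.0, 10.0]
```

The means come back as floats even though the input values are integers.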
🔧 Debug
advanced, 2:00 time limit
Identify the error in groupby aggregation code
What error does this code raise when run?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2'],
    'Score': [10, 20, 30]
})
result = data.groupby('Group').agg({'Score': 'sum', 'Age': 'mean'})
print(result)
💡 Hint
Check whether every column named in the agg dictionary exists in the DataFrame.
Explanation
The DataFrame has no 'Age' column, so trying to aggregate 'Age' causes a KeyError.
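The failure can be reproduced and guarded against as sketched below. This assumes a recent pandas release, where the missing column surfaces as a `KeyError`; older releases may raise a different exception type. The names `spec`, `safe_spec`, and `raised` are illustrative, and dropping missing columns is just one possible guard, not the quiz's prescribed fix.

```python
import pandas as pd

data = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2'],
    'Score': [10, 20, 30]
})

# Aggregation spec that refers to a column the DataFrame does not have
spec = {'Score': 'sum', 'Age': 'mean'}

try:
    data.groupby('Group').agg(spec)
    raised = False
except KeyError as exc:
    raised = True
    print(f"KeyError: {exc}")

# One possible guard: keep only the entries whose column actually exists
safe_spec = {col: func for col, func in spec.items() if col in data.columns}
result = data.groupby('Group').agg(safe_spec)
print(result)
```

Silently dropping a missing column can hide a typo, so in real code it may be preferable to fail early with a clear message instead.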
🚀 Application
advanced, 2:30 time limit
Calculate median per group and filter groups
You have a DataFrame with sales data. Which code correctly calculates the median sales per 'Store' and returns only stores with median sales above 100?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Store': ['S1', 'S1', 'S2', 'S2', 'S3'],
    'Sales': [120, 80, 150, 200, 90]
})
💡 Hint
Median is the middle value, not the average or sum.
Explanation
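The correct option groups by 'Store', calculates the median sales per store, then keeps only the stores whose median is above 100. Here S1 has a median of (80+120)/2 = 100, S2 has (150+200)/2 = 175, and S3 has 90, so only S2 qualifies.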
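The answer options are not reproduced here, so the following is only one way to write the group-then-filter logic described, not necessarily the quiz's own wording. The names `medians` and `above_100` are illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'Store': ['S1', 'S1', 'S2', 'S2', 'S3'],
    'Sales': [120, 80, 150, 200, 90]
})

# Split by 'Store' and apply median to 'Sales'; the result is a Series per store
medians = data.groupby('Store')['Sales'].median()

# Boolean mask keeps only the stores whose median is strictly above 100
above_100 = medians[medians > 100]
print(above_100)
```

S1 sits exactly at 100, so the strict comparison excludes it.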
🧠 Conceptual
expert, 1:30 time limit
Understanding split-apply-combine with custom aggregation
Which option best describes what happens in the split-apply-combine process when using a custom aggregation function in pandas groupby?
💡 Hint
Think about the order: split, apply, then combine.
Explanation
Split-apply-combine means data is split into groups, the function is applied to each group, then results are combined.
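The three stages can be made concrete with a small custom aggregation. In the sketch below, `point_range` is a hypothetical function written for this illustration, and the data is borrowed from the first challenge.

```python
import pandas as pd

data = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 5, 20]
})

def point_range(values):
    """Hypothetical custom aggregation: spread between a group's max and min."""
    return values.max() - values.min()

# Split:   groupby partitions 'Points' into one sub-Series per team
# Apply:   point_range is called on each sub-Series and returns a scalar
# Combine: the scalars are assembled into a Series indexed by 'Team'
result = data.groupby('Team')['Points'].agg(point_range)
print(result)
```

Because the function reduces each group to a single value, the combined result has one row per group.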