Challenge - 5 Problems
Groupby Master
❓ Predict Output
intermediate
Output of groupby() with sum aggregation
What is the output of this code snippet?
Data Analysis Python
import pandas as pd
data = {'Team': ['A', 'A', 'B', 'B', 'C'], 'Points': [10, 15, 10, 5, 20]}
df = pd.DataFrame(data)
result = df.groupby('Team').sum()
print(result)
💡 Hint
Sum adds all points for each team.
The groupby groups rows by 'Team'. Then sum adds 'Points' for each group. Team A has 10+15=25, B has 10+5=15, C has 20.
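The arithmetic above can be checked directly. This is a minimal sketch that reuses the question's data; the `totals` dictionary is only an illustrative helper for reading the result and is not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B', 'C'],
                   'Points': [10, 15, 10, 5, 20]})

# Group rows by team, then add up the numeric column within each group.
result = df.groupby('Team').sum()

# 'Team' becomes the index of the result; 'Points' holds the per-team totals.
# Convert to plain Python ints so the printed form is easy to read.
totals = {team: int(points) for team, points in result['Points'].items()}
print(totals)  # {'A': 25, 'B': 15, 'C': 20}
```

Note that the printed DataFrame in the question shows 'Team' as the row index, not as a regular column.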
❓ Data Output
intermediate
Number of groups created by groupby()
How many groups are created by this groupby operation?
Data Analysis Python
import pandas as pd
data = {'Category': ['X', 'Y', 'X', 'Z', 'Y', 'Z', 'Z'], 'Value': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data)
groups = df.groupby('Category')
print(len(groups))
💡 Hint
Count unique categories in 'Category' column.
The unique categories are X, Y, and Z, so 3 groups are created.
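As a quick check, the group count should match the number of distinct values in the grouping column. The sketch below reuses the question's data; the variable names `n_groups`, `n_unique`, and `sizes` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['X', 'Y', 'X', 'Z', 'Y', 'Z', 'Z'],
                   'Value': [1, 2, 3, 4, 5, 6, 7]})
groups = df.groupby('Category')

# len() on a GroupBy object gives the number of groups.
n_groups = len(groups)

# The same number falls out of counting distinct values in the key column.
n_unique = df['Category'].nunique()

# size() reports how many rows landed in each group.
sizes = {key: int(count) for key, count in groups.size().items()}

print(n_groups, n_unique)  # 3 3
print(sizes)               # {'X': 2, 'Y': 2, 'Z': 3}
```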
🔧 Debug
advanced
Error raised by incorrect groupby syntax
What error does this code raise?
Data Analysis Python
import pandas as pd
data = {'Type': ['A', 'B', 'A'], 'Score': [5, 10, 15]}
df = pd.DataFrame(data)
result = df.groupby('Type', as_index=False).mean()
print(result)
# Now this incorrect code:
result2 = df.groupby(Type).sum()
💡 Hint
Check if 'Type' is quoted or not in groupby argument.
The second groupby uses Type without quotes, so Python looks for a variable named Type which is not defined, causing NameError.
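To see the failure without stopping the script, the faulty call can be wrapped in `try`/`except`. This is only a sketch for illustration: the `error_name` and `fixed` variables are not part of the original snippet, and the example assumes no variable named `Type` exists in scope.

```python
import pandas as pd

df = pd.DataFrame({'Type': ['A', 'B', 'A'], 'Score': [5, 10, 15]})

error_name = None
try:
    # Type (no quotes) is looked up as a Python variable before pandas
    # is ever called, so the failure comes from Python itself.
    result2 = df.groupby(Type).sum()
except NameError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)  # NameError name 'Type' is not defined

# Quoting the column name makes the call valid.
fixed = df.groupby('Type').sum()
print(int(fixed.loc['A', 'Score']))  # 20
```

Because the name lookup fails before `groupby` runs, the error does not depend on the contents of the DataFrame.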
🚀 Application
advanced
Using groupby() to find max value per group
Which option correctly finds the maximum 'Sales' value for each 'Region'?
Data Analysis Python
import pandas as pd
data = {'Region': ['East', 'West', 'East', 'West', 'North'], 'Sales': [200, 150, 300, 100, 250]}
df = pd.DataFrame(data)
💡 Hint
Group by 'Region' and select 'Sales' column before max.
The correct option groups by 'Region', selects the 'Sales' column, then finds the max per group. Of the remaining options, one takes the max of every non-grouping column, which works but returns a DataFrame rather than just the 'Sales' values ('Region' becomes the index, not a column). One is invalid syntax. One groups by 'Sales', which is wrong.
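The answer options themselves are not reproduced here, so the following is only a sketch of the approach the hint describes (group by 'Region', select 'Sales', then take the max), not a copy of any specific option. The names `max_sales`, `best`, and `all_cols` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Region': ['East', 'West', 'East', 'West', 'North'],
                   'Sales': [200, 150, 300, 100, 250]})

# Select the 'Sales' column after grouping, then take the max per region.
# The result is a Series indexed by 'Region' (sorted by key by default).
max_sales = df.groupby('Region')['Sales'].max()
best = {region: int(sales) for region, sales in max_sales.items()}
print(best)  # {'East': 300, 'North': 250, 'West': 150}

# Calling max() without selecting a column aggregates every remaining
# column and returns a DataFrame; here that is just 'Sales'.
all_cols = df.groupby('Region').max()
print(list(all_cols.columns))  # ['Sales']
```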
🧠 Conceptual
expert
Understanding groupby() with multiple aggregation functions
Given this code, what is the shape (rows, columns) of the resulting DataFrame?
Data Analysis Python
import pandas as pd
data = {'Category': ['A', 'A', 'B', 'B', 'C'], 'Value': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
result = df.groupby('Category').agg(['min', 'max'])
💡 Hint
Count unique groups and number of aggregation functions.
There are 3 unique categories (A, B, C). Two aggregation functions ('min' and 'max') are applied to 'Value'. So rows=3 groups, columns=2 aggregations.
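The shape can be confirmed by inspecting the result. This sketch reuses the question's data and also prints the column labels, which come back as (column, function) pairs when a list of functions is passed to `agg`.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'A', 'B', 'B', 'C'],
                   'Value': [1, 2, 3, 4, 5]})

# Apply two aggregations to every non-grouping column.
result = df.groupby('Category').agg(['min', 'max'])

# One row per category, one column per (source column, function) pair.
print(result.shape)          # (3, 2)
print(list(result.columns))  # [('Value', 'min'), ('Value', 'max')]
```

With more than one non-grouping column, the column count would be the number of those columns multiplied by the number of aggregation functions.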