Challenge - 5 Problems
Groupby Mastery Badge
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate · 2:00 remaining
Output of groupby sum aggregation
What is the output of this code?
import pandas as pd
data = {'Team': ['A', 'A', 'B', 'B', 'C'], 'Points': [10, 15, 10, 5, 20]}
df = pd.DataFrame(data)
result = df.groupby('Team').sum()
print(result)
💡 Hint
Remember that groupby('Team') groups rows by the 'Team' column, and sum() then adds up the 'Points' within each group.
Explanation
groupby('Team') groups rows by team name, and sum() then adds the Points in each group. So A has 10 + 15 = 25, B has 10 + 5 = 15, and C has 20.
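The arithmetic in the explanation can be checked with a minimal sketch using the same data. The variable name `totals` and the conversion to a dict are illustrative choices, not part of the original question.

```python
import pandas as pd

data = {'Team': ['A', 'A', 'B', 'B', 'C'], 'Points': [10, 15, 10, 5, 20]}
df = pd.DataFrame(data)

# Group rows by team, then add up Points within each group.
# The result is a DataFrame indexed by 'Team' with one 'Points' column.
result = df.groupby('Team').sum()

# Convert the 'Points' column to a plain dict to inspect the totals
# (illustrative helper step; A -> 25, B -> 15, C -> 20).
totals = result['Points'].to_dict()
print(totals)
```

Note that the grouping key becomes the index of the result, so 'Team' is no longer a regular column.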
❓ Data Output
intermediate · 1:30 remaining
Number of groups created by groupby
Given this DataFrame, how many groups will be created by grouping on the 'Category' column?
import pandas as pd
data = {'Category': ['X', 'Y', 'X', 'Z', 'Y', 'Z', 'Z'], 'Value': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data)
groups = df.groupby('Category')
print(len(groups))
💡 Hint
Count unique values in the 'Category' column.
Explanation
The unique categories are X, Y, and Z, so 3 groups are created.
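As a rough check of the count, the sketch below compares three ways of getting the number of groups on the same data. The variable names `n_from_len`, `n_from_attr`, and `n_unique` are hypothetical and used only for illustration.

```python
import pandas as pd

data = {'Category': ['X', 'Y', 'X', 'Z', 'Y', 'Z', 'Z'],
        'Value': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data)
groups = df.groupby('Category')

# len() on a groupby object returns the number of groups.
n_from_len = len(groups)

# The ngroups attribute reports the same count.
n_from_attr = groups.ngroups

# Counting unique values in the key column agrees here because
# the column contains no missing values.
n_unique = df['Category'].nunique()

print(n_from_len, n_from_attr, n_unique)
```

All three agree for this data. With missing values in the key column the picture can differ, since groupby drops missing keys by default.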
🔧 Debug
advanced · 1:30 remaining
Identify the error in groupby usage
What error does this code raise?
import pandas as pd
data = {'Name': ['Anna', 'Bob', 'Cara'], 'Score': [90, 80, 85]}
df = pd.DataFrame(data)
result = df.groupby('Age').mean()
💡 Hint
Check if the column used in groupby exists in the DataFrame.
Explanation
The DataFrame has no 'Age' column, so groupby('Age') raises a KeyError.
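The failure can be reproduced and guarded against with a short sketch. The `try`/`except` wrapper, the `has_age` check, and the variable name `error_name` are assumptions added for illustration; the original snippet simply lets the exception propagate.

```python
import pandas as pd

data = {'Name': ['Anna', 'Bob', 'Cara'], 'Score': [90, 80, 85]}
df = pd.DataFrame(data)

# A simple guard: check whether the grouping column exists first.
has_age = 'Age' in df.columns

# Attempt the grouping anyway and record which exception type is raised.
error_name = None
try:
    df.groupby('Age').mean()
except KeyError as exc:
    error_name = type(exc).__name__

print(has_age, error_name)
```

Checking `df.columns` before grouping is one way to turn this runtime error into an explicit, readable condition.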
🚀 Application
advanced · 2:00 remaining
Calculate average sales per region
You have sales data:
import pandas as pd
data = {'Region': ['North', 'South', 'North', 'East', 'South', 'East'], 'Sales': [100, 200, 150, 300, 250, 350]}
df = pd.DataFrame(data)
Which code correctly calculates the average sales per region?
💡 Hint
Group by 'Region' and then calculate mean of 'Sales'.
Explanation
The correct option groups by 'Region' and calculates the mean of 'Sales'. The other options are wrong: one groups by the sales values instead of the region, one divides the overall sum by the total row count rather than averaging per group, and one counts rows per region instead of averaging sales.
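The answer options themselves are not shown here, so the sketch below is an illustrative reconstruction of the approach the hint describes, alongside two of the mistaken calculations for contrast. The variable names `avg_sales`, `overall_avg`, and `rows_per_region` are hypothetical.

```python
import pandas as pd

data = {'Region': ['North', 'South', 'North', 'East', 'South', 'East'],
        'Sales': [100, 200, 150, 300, 250, 350]}
df = pd.DataFrame(data)

# Per-region average: group by 'Region', select 'Sales', take the mean.
avg_sales = df.groupby('Region')['Sales'].mean()

# For contrast: one overall average across all rows (not per group).
overall_avg = df['Sales'].sum() / len(df)

# For contrast: the number of rows per region (a count, not an average).
rows_per_region = df.groupby('Region')['Sales'].count()

print(avg_sales)
```

For this data the per-region means are 325 for East, 125 for North, and 225 for South, while the overall average collapses everything into the single value 225.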
🧠 Conceptual
expert · 1:30 remaining
Understanding groupby object behavior
Which statement about a pandas groupby object is TRUE?
💡 Hint
Think about what happens when you loop over a groupby object.
Explanation
A groupby object is lazy: it does not compute any aggregation until one is requested. It is iterable and yields (group name, group DataFrame) pairs. It cannot be indexed by row like a DataFrame, and it does not store the groups as a list; the grouping is held in a dedicated GroupBy object.
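The iteration behavior described above can be observed with a small sketch. The sample data and the names `groups`, `seen`, and `kind` are assumptions chosen for illustration and do not come from the original question.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({'Category': ['X', 'Y', 'X', 'Z'],
                   'Value': [1, 2, 3, 4]})

# Creating the groupby object performs no aggregation yet.
groups = df.groupby('Category')
kind = type(groups).__name__

# Looping yields (group name, group DataFrame) pairs.
seen = []
for name, group in groups:
    seen.append((name, type(group).__name__, len(group)))

# An aggregation is only computed when explicitly requested.
sums = groups['Value'].sum()

print(kind)
print(seen)
```

Each iteration hands back the group key together with the sub-DataFrame holding that group's rows, which is why unpacking into two names works in the loop.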