Challenge - 5 Problems
Groupby Mastery
❓ Predict Output
intermediate
Output of groupby sum aggregation
What is the output of this code that groups sales by product category and sums the sales?
Data Analysis Python
import pandas as pd

data = {'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable', 'Fruit'],
        'Sales': [10, 15, 7, 8, 5]}
df = pd.DataFrame(data)

result = df.groupby('Category')['Sales'].sum()
print(result)
💡 Hint
Sum adds all sales values within each category.
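Explanation
groupby('Category') splits the rows by 'Category', and sum() then adds the 'Sales' values within each group. Fruit is 10 + 15 + 5 = 30 and Vegetable is 7 + 8 = 15, so the printed result is a Series indexed by Category with those two totals.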
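As a quick check of the arithmetic, the sketch below rebuilds the same DataFrame and compares the grouped sums with the totals worked out by hand. The name `expected` is introduced here for illustration only and is not part of the challenge.

```python
import pandas as pd

data = {'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable', 'Fruit'],
        'Sales': [10, 15, 7, 8, 5]}
df = pd.DataFrame(data)

# groupby sorts the group keys by default, so 'Fruit' comes before 'Vegetable'.
result = df.groupby('Category')['Sales'].sum()

# Hand-computed totals (illustrative name, not part of the challenge).
expected = {'Fruit': 30, 'Vegetable': 15}

print(result.to_dict() == expected)  # compares the grouped sums with the manual totals
print(list(result.index))            # the group labels become the index of the result
```

The result keeps the name of the selected column ('Sales'), which is why the printed Series ends with a Name line.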
❓ data_output
intermediate
Number of groups created by groupby
How many groups are created when grouping this data by 'Type'?
Data Analysis Python
import pandas as pd

data = {'Type': ['A', 'B', 'A', 'C', 'B', 'C', 'A'],
        'Value': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data)

groups = df.groupby('Type')
print(len(groups))
💡 Hint
Count unique values in 'Type' column.
Explanation
The 'Type' column has three unique values (A, B, and C), so groupby creates 3 groups and len(groups) prints 3.
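The group count can be cross-checked in more than one way. The sketch below, which assumes the same data as the challenge, compares `len(groups)` with the `ngroups` attribute and with the number of unique values in the column; the variable names are illustrative.

```python
import pandas as pd

data = {'Type': ['A', 'B', 'A', 'C', 'B', 'C', 'A'],
        'Value': [1, 2, 3, 4, 5, 6, 7]}
df = pd.DataFrame(data)
groups = df.groupby('Type')

n_from_len = len(groups)             # number of groups via len()
n_from_attr = groups.ngroups         # same number via the ngroups attribute
n_unique = df['Type'].nunique()      # unique values in the grouping column

# size() reports how many rows fall into each group.
group_sizes = groups.size().to_dict()

print(n_from_len, n_from_attr, n_unique)
print(group_sizes)
```

The three counts agree here because the column has no missing values; by default both `groupby` and `nunique` leave missing keys out.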
🔧 Debug
advanced
Identify the error in groupby usage
What error does this code raise when trying to group by a non-existent column?
Data Analysis Python
import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [1, 2, 3]}
df = pd.DataFrame(data)

result = df.groupby('Type')['Value'].sum()
print(result)
💡 Hint
Check if the column name exists in the DataFrame.
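Explanation
The code groups by 'Type', but df only has the columns 'Category' and 'Value'. Because 'Type' is not a column (or index level) of df, pandas raises KeyError: 'Type' at the groupby call, before sum() ever runs.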
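One way to handle this situation, sketched below, is to check that the column exists before grouping, or to catch the KeyError. The helper `safe_group_sum` is hypothetical and written only for this illustration; it is not a pandas function.

```python
import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [1, 2, 3]}
df = pd.DataFrame(data)


def safe_group_sum(frame, key, value):
    """Hypothetical helper: return grouped sums, or None if `key` is not a column."""
    if key not in frame.columns:
        return None
    return frame.groupby(key)[value].sum()


# Grouping by a missing column raises KeyError; capture it to inspect it.
try:
    df.groupby('Type')['Value'].sum()
    raised = None
except KeyError as exc:
    raised = exc

print(type(raised).__name__)
print(safe_group_sum(df, 'Type', 'Value'))      # missing column: helper returns None
print(safe_group_sum(df, 'Category', 'Value'))  # existing column: grouped sums
```

Checking `key in frame.columns` up front gives a clear place to report the problem; the try/except form is an alternative when the column name comes from user input.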
❓ visualization
advanced
Visualizing grouped data with bar plot
Which option produces a bar plot showing total sales per region after grouping?
Data Analysis Python
import pandas as pd
import matplotlib.pyplot as plt

data = {'Region': ['North', 'South', 'North', 'East', 'South'],
        'Sales': [100, 150, 200, 130, 170]}
df = pd.DataFrame(data)

grouped = df.groupby('Region')['Sales'].sum()
# Which code below plots grouped data correctly?
💡 Hint
Use the grouped Series plot method with kind='bar'.
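Explanation
The correct option calls the plot method on the grouped Series with kind='bar', which draws one bar per region showing its total sales. Among the other options, one plots the raw data rather than the grouped sums, and one passes the Series to plt.bar as its only argument, which raises a TypeError because plt.bar needs both the x positions and the bar heights. One further option is described as valid syntax that also works.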
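A minimal sketch of the approach named in the hint is shown below, using the challenge's data. The 'Agg' backend and the axis labels are assumptions added so the example runs without a display; the second figure shows an equivalent call that passes the labels and heights to `bar` separately.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

data = {'Region': ['North', 'South', 'North', 'East', 'South'],
        'Sales': [100, 150, 200, 130, 170]}
df = pd.DataFrame(data)
grouped = df.groupby('Region')['Sales'].sum()

# Plot the grouped Series directly: one bar per region, in index order.
ax = grouped.plot(kind='bar')
ax.set_xlabel('Region')
ax.set_ylabel('Total sales')
bar_heights = [patch.get_height() for patch in ax.patches]

# Equivalent Matplotlib call: x labels and heights are passed separately.
fig2, ax2 = plt.subplots()
ax2.bar(grouped.index, grouped.values)

print(bar_heights)
```

In an interactive session, `plt.show()` would display the figures; with the 'Agg' backend they can be written to a file with `savefig` instead.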
🧠 Conceptual
expert
Why does groupby summarize data by category?
Why is grouping data by category useful in data analysis?
💡 Hint
Think about what grouping helps you do with data.
Explanation
Grouping lets you split data into categories so you can compute sums, averages, counts, and other summaries per group. This helps understand patterns by category.
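To make the idea concrete, the sketch below reuses the sales data from the first challenge and computes several summaries per category in one step with `agg`. The choice of data and the name `summary` are illustrative.

```python
import pandas as pd

data = {'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable', 'Fruit'],
        'Sales': [10, 15, 7, 8, 5]}
df = pd.DataFrame(data)

# Split the rows by category, apply three aggregations, and combine the
# results into one table with a row per category.
summary = df.groupby('Category')['Sales'].agg(['sum', 'mean', 'count'])

print(summary)
```

Each row of `summary` describes one category, so totals, averages, and counts can be compared side by side across groups.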