Challenge - 5 Problems
Master of Grouping
❓ data_output
intermediate
Output of single column grouping with aggregation
Given the following DataFrame, what is the output of grouping by the 'Category' column and calculating the sum of 'Sales'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'],
        'Sales': [100, 200, 150, 300, 250]}
df = pd.DataFrame(data)

result = df.groupby('Category')['Sales'].sum()
print(result)
💡 Hint
Sum the 'Sales' values for each unique 'Category'.
Explanation
Category A sums to 100 + 150 = 250, Category B to 200 + 300 = 500, and Category C to 250. The result is a Series indexed by Category.
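As a quick check of the arithmetic above, the following sketch rebuilds the same DataFrame and compares the grouped sums with the hand-computed totals. The name expected is introduced here purely for illustration.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'],
        'Sales': [100, 200, 150, 300, 250]}
df = pd.DataFrame(data)

# Group rows by Category and add up Sales within each group.
result = df.groupby('Category')['Sales'].sum()

# Hand-computed totals from the explanation (illustrative name).
expected = {'A': 250, 'B': 500, 'C': 250}

print(result.to_dict() == expected)  # True
```

Converting the Series to a dict makes the comparison independent of how the Series is printed.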
❓ data_output
intermediate
Output of multiple column grouping with mean aggregation
What is the output of grouping the DataFrame by 'Category' and 'Region' columns and calculating the mean of 'Sales'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350]}
df = pd.DataFrame(data)

result = df.groupby(['Category', 'Region'])['Sales'].mean()
print(result)
💡 Hint
Calculate the average sales for each combination of Category and Region.
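Explanation
Each (Category, Region) pair occurs exactly once, so the mean of each group is just its single Sales value. The result is a Series of floats with a two-level (Category, Region) index.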
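A minimal sketch of how the two-level result can be inspected, using the same data as the question. The name table is an illustrative choice, and unstack() is shown only as one convenient way to view the result as a grid.

```python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350]}
df = pd.DataFrame(data)

result = df.groupby(['Category', 'Region'])['Sales'].mean()

# Each entry is addressed by a (Category, Region) tuple.
print(result.loc[('A', 'East')])   # 100.0
print(list(result.index.names))    # ['Category', 'Region']

# unstack() pivots the inner index level (Region) into columns.
table = result.unstack()
print(table.loc['B', 'West'])      # 300.0
```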
🧠 Conceptual
advanced
Understanding groupby object behavior
What happens if you try to access a column that was not included in the aggregation after grouping a DataFrame by a column?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A'],
        'Sales': [100, 200, 150],
        'Profit': [30, 50, 40]}
df = pd.DataFrame(data)

grouped = df.groupby('Category')['Sales'].sum()
print(grouped['Profit'])
💡 Hint
Check if the grouped object contains the 'Profit' column.
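Explanation
After grouping, selecting 'Sales', and summing, grouped is a Series of Sales totals indexed by Category (A and B). It has no 'Profit' column, so grouped['Profit'] is treated as a lookup of the label 'Profit' in that index and raises a KeyError.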
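The behaviour described above can be reproduced with a short sketch. The names raised and both are illustrative, and selecting both columns before aggregating is shown as one possible way to keep 'Profit' available, not the only one.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A'],
        'Sales': [100, 200, 150],
        'Profit': [30, 50, 40]}
df = pd.DataFrame(data)

# A Series of Sales totals whose index labels are the categories.
grouped = df.groupby('Category')['Sales'].sum()

try:
    grouped['Profit']        # 'Profit' is not a label in the index
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# Selecting both columns before aggregating keeps Profit in the result.
both = df.groupby('Category')[['Sales', 'Profit']].sum()
print(both.loc['A', 'Profit'])  # 70
```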
🔧 Debug
advanced
Identify the error in multiple column grouping code
What error will this code raise when grouping by multiple columns and aggregating?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A'],
        'Region': ['East', 'West', 'East'],
        'Sales': [100, 200, 150]}
df = pd.DataFrame(data)

result = df.groupby('Category', 'Region')['Sales'].sum()
print(result)
💡 Hint
Check the syntax of the groupby method arguments.
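Explanation
groupby takes its grouping keys as a single argument, so several columns must be passed together as a list: df.groupby(['Category', 'Region']). Here 'Region' arrives as a second positional argument, which pandas does not treat as a grouping key (in pandas 1.x and 2.x that position is the axis parameter), so the call fails instead of grouping by both columns.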
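A hedged sketch of the failing call next to the list form. The exact exception type and message may differ between pandas versions, so the example catches a general Exception and only prints the type name; failed is an illustrative variable.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A'],
        'Region': ['East', 'West', 'East'],
        'Sales': [100, 200, 150]}
df = pd.DataFrame(data)

try:
    # 'Region' is passed as a second positional argument, not as a key.
    df.groupby('Category', 'Region')['Sales'].sum()
    failed = False
except Exception as exc:  # exact type depends on the pandas version
    failed = True
    print(type(exc).__name__)

# Passing both keys in one list groups by each (Category, Region) pair.
result = df.groupby(['Category', 'Region'])['Sales'].sum()
print(result.to_dict())
```

With the list form, the two ('A', 'East') rows are combined into a single group totalling 250.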
🚀 Application
expert
Calculate total sales and average profit by category and region
Given the DataFrame below, which code snippet correctly groups by 'Category' and 'Region' to calculate total 'Sales' and average 'Profit'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350],
        'Profit': [20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)
💡 Hint
Use a dictionary to specify different aggregation functions for each column.
Explanation
The correct snippet groups by both columns and applies sum to Sales and mean to Profit. The other options each fail in a different way: one has wrong groupby syntax, one uses deprecated syntax for selecting multiple columns, and one swaps the aggregation functions.
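Since the answer options themselves are not reproduced here, the following is only a sketch of what the hint describes: a dictionary passed to agg that maps each column to its own aggregation function. It is one way to write the correct approach, not necessarily the exact wording of the quiz option.

```python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350],
        'Profit': [20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)

# Map each column to the aggregation it should receive.
result = df.groupby(['Category', 'Region']).agg({'Sales': 'sum',
                                                 'Profit': 'mean'})

print(result.loc[('A', 'East'), 'Sales'])    # 100
print(result.loc[('C', 'West'), 'Profit'])   # 70.0
```

The result is a DataFrame with a (Category, Region) index and one column per aggregated field.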