Challenge - 5 Problems
❓ Predict Output
Difficulty: intermediate
Output of grouping by two columns and summing
What is the output of this code snippet that groups data by two columns and sums the values?
Pandas
import pandas as pd

data = pd.DataFrame({
    'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
    'Category': ['A', 'A', 'B', 'B', 'A'],
    'Sales': [100, 200, 150, 300, 50]
})
result = data.groupby(['City', 'Category']).sum()
print(result)
💡 Hint
Think about how groupby sums values for each unique pair of City and Category.
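Explanation
groupby sums Sales for each unique (City, Category) pair. (LA, B) has a single row, so its total is 300, not 450. (NY, A) has two rows, so its total is 100 + 50 = 150. The remaining pairs, (LA, A) and (NY, B), have one row each: 200 and 150.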
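The per-pair totals can be checked directly. The sketch below reuses the DataFrame from the question; the names `totals` and `flat` are illustrative only, and `as_index=False` is shown as one optional way to get the group keys back as ordinary columns.

```python
import pandas as pd

data = pd.DataFrame({
    'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
    'Category': ['A', 'A', 'B', 'B', 'A'],
    'Sales': [100, 200, 150, 300, 50],
})

# Grouping by two columns gives a result indexed by (City, Category).
result = data.groupby(['City', 'Category']).sum()

# Map each (City, Category) pair to its summed Sales for easy inspection.
totals = result['Sales'].to_dict()
print(totals)

# as_index=False keeps City and Category as regular columns instead of an index.
flat = data.groupby(['City', 'Category'], as_index=False)['Sales'].sum()
print(flat)
```

A single group can also be looked up by its key, for example `result.loc[('NY', 'A'), 'Sales']`.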
❓ Data Output
Difficulty: intermediate
Number of groups formed by grouping on two columns
Given this DataFrame, how many groups will be formed when grouping by 'Type' and 'Color'?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Type': ['Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Vegetable'],
    'Color': ['Red', 'Green', 'Green', 'Red', 'Red'],
    'Quantity': [10, 15, 7, 5, 3]
})
groups = data.groupby(['Type', 'Color'])
print(len(groups))
💡 Hint
Count unique pairs of Type and Color in the data.
Explanation
The unique pairs are (Fruit, Red), (Fruit, Green), (Vegetable, Green), and (Vegetable, Red), so groupby forms 4 groups. (Fruit, Red) appears in two rows but still counts as one group.
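As a quick cross-check of the count, the number of groups should match the number of distinct (Type, Color) rows. This is a minimal sketch on the question's DataFrame; `unique_pairs` is an illustrative name.

```python
import pandas as pd

data = pd.DataFrame({
    'Type': ['Fruit', 'Fruit', 'Vegetable', 'Fruit', 'Vegetable'],
    'Color': ['Red', 'Green', 'Green', 'Red', 'Red'],
    'Quantity': [10, 15, 7, 5, 3],
})

groups = data.groupby(['Type', 'Color'])

# ngroups reports the number of groups; len(groups) gives the same count.
print(groups.ngroups)

# Counting distinct (Type, Color) rows is the manual version of the same check.
unique_pairs = data[['Type', 'Color']].drop_duplicates()
print(len(unique_pairs))
```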
🔧 Debug
Difficulty: advanced
Identify the error in grouping by multiple columns
What error will this code raise when trying to group by multiple columns?
Pandas
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, 1],
    'B': [3, 4, 3],
    'C': [5, 6, 7]
})
result = data.groupby('A', 'B').sum()
print(result)
💡 Hint
Check how groupby expects multiple columns as input.
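Explanation
groupby expects multiple columns as a single list (or other array-like) in its first argument, as in data.groupby(['A', 'B']), not as separate positional arguments. In data.groupby('A', 'B'), the second argument is not treated as a grouping column; in pandas 1.x and 2.x it is passed as axis, and the call raises a ValueError because 'B' is not a valid axis name.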
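A corrected version of the call, as a minimal sketch: the only change from the question's code is that the two column names are wrapped in one list.

```python
import pandas as pd

data = pd.DataFrame({
    'A': [1, 2, 1],
    'B': [3, 4, 3],
    'C': [5, 6, 7],
})

# Pass both grouping columns as one list in the first argument.
result = data.groupby(['A', 'B']).sum()
print(result)
```

Rows 0 and 2 share the key (1, 3), so their C values are summed together, while (2, 4) has a single row.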
❓ Visualization
Difficulty: advanced
Visualizing grouped data by multiple columns
Which option correctly creates a bar plot showing the sum of 'Value' grouped by 'Group' and 'Subgroup'?
Pandas
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({
    'Group': ['X', 'X', 'Y', 'Y', 'X'],
    'Subgroup': ['A', 'B', 'A', 'B', 'A'],
    'Value': [10, 20, 30, 40, 50]
})
grouped = data.groupby(['Group', 'Subgroup']).sum()
# Which code below plots the grouped data correctly?
💡 Hint
Think about how to reshape grouped data for a clear bar plot with multiple categories.
Explanation
unstack() moves the Subgroup index level into columns, so each Group gets one bar per Subgroup in a grouped bar plot.
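One way the unstack-then-plot approach might look is sketched below. It is not necessarily the exact option from the quiz: selecting the 'Value' column before summing is an assumption made here to keep the plotted column labels simple, the names `totals` and `wide` are illustrative, and the Agg backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

data = pd.DataFrame({
    'Group': ['X', 'X', 'Y', 'Y', 'X'],
    'Subgroup': ['A', 'B', 'A', 'B', 'A'],
    'Value': [10, 20, 30, 40, 50],
})

# Sum Value per (Group, Subgroup); the result is a Series with a two-level index.
totals = data.groupby(['Group', 'Subgroup'])['Value'].sum()

# unstack() pivots the inner level (Subgroup) into columns:
# rows are Groups, columns are Subgroups.
wide = totals.unstack()
print(wide)

# Each row (Group) becomes a cluster of bars, one bar per Subgroup column.
ax = wide.plot(kind='bar')
ax.set_ylabel('Sum of Value')
plt.tight_layout()
```

In an interactive session, `plt.show()` would display the figure.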
🚀 Application
Difficulty: expert
Calculate weighted average after grouping by multiple columns
You have this DataFrame with 'Category', 'Subcategory', 'Score', and 'Weight'. How do you calculate the weighted average Score for each Category and Subcategory group?
Pandas
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'A'],
    'Subcategory': ['X', 'X', 'Y', 'Y', 'Z'],
    'Score': [80, 90, 70, 60, 85],
    'Weight': [1, 2, 1, 3, 2]
})
💡 Hint
Weighted average is sum of (value * weight) divided by sum of weights.
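Explanation
The correct answer (Option A) computes the weighted average per group with apply and a lambda: within each (Category, Subcategory) group, the sum of Score * Weight is divided by the sum of Weight.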
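A minimal sketch of the apply-and-lambda approach on the question's data follows. The exact wording of the quiz option is not reproduced here; selecting `[['Score', 'Weight']]` before `apply` is an assumption added so that the lambda only sees the two columns it needs, and `weighted` is an illustrative name.

```python
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'A'],
    'Subcategory': ['X', 'X', 'Y', 'Y', 'Z'],
    'Score': [80, 90, 70, 60, 85],
    'Weight': [1, 2, 1, 3, 2],
})

# For each (Category, Subcategory) group:
# weighted average = sum(Score * Weight) / sum(Weight)
weighted = (
    data.groupby(['Category', 'Subcategory'])[['Score', 'Weight']]
    .apply(lambda g: (g['Score'] * g['Weight']).sum() / g['Weight'].sum())
)
print(weighted)
```

For the (A, X) group this works out to (80 * 1 + 90 * 2) / (1 + 2) = 260 / 3, and a group with a single row, such as (A, Z), simply returns that row's Score.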