Single and multiple column grouping in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Data output
intermediate
Output of single column grouping with aggregation
Given the following DataFrame, what is the output of grouping by the 'Category' column and calculating the sum of 'Sales'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'], 'Sales': [100, 200, 150, 300, 250]}
df = pd.DataFrame(data)
result = df.groupby('Category')['Sales'].sum()
print(result)
A. Category\nA 250\nB 500\nC 250\nName: Sales, dtype: int64
B. Category\nA 100\nB 200\nC 250\nName: Sales, dtype: int64
C. Category\nA 150\nB 300\nC 250\nName: Sales, dtype: int64
D. Category\nA 350\nB 500\nC 250\nName: Sales, dtype: int64
💡 Hint
Sum the 'Sales' values for each unique 'Category'.
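The hint can be tried out on a separate, made-up dataset before answering. This is only a minimal sketch: the `scores` frame and its `Team` and `Points` columns are hypothetical and are not part of the challenge data.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
scores = pd.DataFrame({
    "Team": ["red", "blue", "red", "blue", "green"],
    "Points": [1, 2, 3, 4, 5],
})

# Split the rows by 'Team', then add up 'Points' inside each group.
# The result is a Series whose index holds the unique team names.
totals = scores.groupby("Team")["Points"].sum()
print(totals)
```

Each unique key appears once in the result, and by default the keys are sorted.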
Data output
intermediate
Output of multiple column grouping with mean aggregation
What is the output of grouping the DataFrame by 'Category' and 'Region' columns and calculating the mean of 'Sales'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'], 'Region': ['East', 'West', 'East', 'West', 'East', 'West'], 'Sales': [100, 150, 200, 300, 250, 350]}
df = pd.DataFrame(data)
result = df.groupby(['Category', 'Region'])['Sales'].mean()
print(result)
A. Category Region\nA East 100\n West 150\nB East 200\n West 300\nC East 250\n West 350\nName: Sales, dtype: int64
B. Category Region\nA East 125.0\n West 225.0\nB East 250.0\n West 325.0\nC East 300.0\n West 400.0\nName: Sales, dtype: float64
C. Category Region\nA East 100.0\n West 150.0\nB East 200.0\n West 300.0\nC East 250.0\n West 350.0\nName: Sales, dtype: float64
D. Category Region\nA East 150.0\n West 100.0\nB East 300.0\n West 200.0\nC East 350.0\n West 250.0\nName: Sales, dtype: float64
💡 Hint
Calculate the average sales for each combination of Category and Region.
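Grouping by two columns can be explored on a small, made-up dataset. This is a sketch under stated assumptions: the `orders` frame and its `Store`, `Day`, and `Units` columns are hypothetical and are not taken from the challenge.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
orders = pd.DataFrame({
    "Store": ["north", "north", "north", "south", "south"],
    "Day": ["mon", "mon", "tue", "mon", "tue"],
    "Units": [10, 20, 5, 8, 12],
})

# One group per (Store, Day) combination; the result is indexed by both keys.
avg_units = orders.groupby(["Store", "Day"])["Units"].mean()
print(avg_units)

# A single group is looked up with a tuple of its keys.
print(avg_units.loc[("north", "mon")])
```

The mean of an integer column comes back as floating point, which is worth keeping in mind when comparing candidate outputs.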
🧠 Conceptual
advanced
Understanding groupby object behavior
What happens if you try to access a column that was not included in the aggregation after grouping a DataFrame by a column?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Sales': [100, 200, 150], 'Profit': [30, 50, 40]}
df = pd.DataFrame(data)
grouped = df.groupby('Category')['Sales'].sum()
print(grouped['Profit'])
A. Returns the sum of 'Profit' for each category.
B. Raises a KeyError because 'Profit' is not part of the grouped object.
C. Returns the original 'Profit' column without grouping.
D. Returns NaN values for 'Profit' since it was not aggregated.
💡 Hint
Check if the grouped object contains the 'Profit' column.
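One way to follow the hint is to inspect what an aggregation returns before indexing into it. The sketch below uses a hypothetical `staff` frame with made-up `Dept`, `Hours`, and `Bonus` columns, not the challenge data.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
staff = pd.DataFrame({
    "Dept": ["x", "y", "x"],
    "Hours": [8, 6, 7],
    "Bonus": [1, 2, 3],
})

# Selecting one column and aggregating yields a Series, not a DataFrame.
hours = staff.groupby("Dept")["Hours"].sum()
print(type(hours).__name__)        # Series
print(list(hours.index))           # ['x', 'y']

# The index holds group keys, so a column name is not a valid label here.
print("Bonus" in hours.index)      # False

# To aggregate a different column, select it from the groupby itself.
bonus = staff.groupby("Dept")["Bonus"].sum()
print(bonus)
```

Square brackets on the aggregated Series look up index labels (the group keys), not columns of the original frame.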
🔧 Debug
advanced
Identify the error in multiple column grouping code
What error will this code raise when grouping by multiple columns and aggregating?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Region': ['East', 'West', 'East'], 'Sales': [100, 200, 150]}
df = pd.DataFrame(data)
result = df.groupby('Category', 'Region')['Sales'].sum()
print(result)
A. KeyError: 'Region' not found
B. No error, outputs grouped sums correctly
C. SyntaxError: invalid syntax
D. TypeError: groupby() takes 2 positional arguments but 3 were given
💡 Hint
Check the syntax of the groupby method arguments.
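For comparison, here is a minimal sketch of the usual way to pass several grouping keys: as one list given to the first parameter, `by`. Extra positional arguments are read as other parameters of `groupby`, not as further grouping columns. The `visits` frame and its `City`, `Channel`, and `Count` columns are hypothetical.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
visits = pd.DataFrame({
    "City": ["oslo", "rome", "oslo"],
    "Channel": ["web", "app", "web"],
    "Count": [3, 4, 5],
})

# All grouping keys go into a single list.
per_group = visits.groupby(["City", "Channel"])["Count"].sum()
print(per_group)

# Naming the parameter makes the intent explicit and gives the same result.
same = visits.groupby(by=["City", "Channel"])["Count"].sum()
print(per_group.equals(same))
```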
🚀 Application
expert
Calculate total sales and average profit by category and region
Given the DataFrame below, which code snippet correctly groups by 'Category' and 'Region' to calculate total 'Sales' and average 'Profit'?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'], 'Region': ['East', 'West', 'East', 'West', 'East', 'West'], 'Sales': [100, 150, 200, 300, 250, 350], 'Profit': [20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)
A
result = df.groupby(['Category', 'Region']).agg({'Sales': 'sum', 'Profit': 'mean'})
print(result)
B
result = df.groupby('Category', 'Region').agg({'Sales': 'sum', 'Profit': 'mean'})
print(result)
C
result = df.groupby(['Category', 'Region'])[['Sales', 'Profit']].agg(['sum', 'mean'])
print(result)
D
result = df.groupby(['Category', 'Region']).agg({'Sales': 'mean', 'Profit': 'sum'})
print(result)
💡 Hint
Use a dictionary to specify different aggregation functions for each column.
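The dictionary form mentioned in the hint can be sketched on a made-up dataset. The `trips` frame and its `Driver`, `Shift`, `Km`, and `Rating` columns are hypothetical. The second call shows pandas named aggregation, an alternative spelling that also lets you choose the output column names (`total_km` and `avg_rating` are illustrative names).

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
trips = pd.DataFrame({
    "Driver": ["ann", "ann", "bob", "bob"],
    "Shift": ["day", "day", "day", "night"],
    "Km": [10, 30, 20, 40],
    "Rating": [4, 5, 3, 5],
})

# Dictionary form: each key is a column, each value the function applied to it.
summary = trips.groupby(["Driver", "Shift"]).agg({"Km": "sum", "Rating": "mean"})
print(summary)

# Named aggregation: output_name=(column, function).
named = trips.groupby(["Driver", "Shift"]).agg(
    total_km=("Km", "sum"),
    avg_rating=("Rating", "mean"),
)
print(named)
```

Both calls return a DataFrame indexed by the grouping keys. They differ only in how the result columns are labelled.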