Challenge - 5 Problems

Master of Grouping

❓ Data output

intermediate

Output of single column grouping with aggregation

Given the following DataFrame, what is the output of grouping by the 'Category' column and calculating the sum of 'Sales'?

Data Analysis Python

import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'], 'Sales': [100, 200, 150, 300, 250]}
df = pd.DataFrame(data)
result = df.groupby('Category')['Sales'].sum()
print(result)

A. Category\nA 250\nB 500\nC 250\nName: Sales, dtype: int64

B. Category\nA 100\nB 200\nC 250\nName: Sales, dtype: int64

C. Category\nA 150\nB 300\nC 250\nName: Sales, dtype: int64

D. Category\nA 350\nB 500\nC 250\nName: Sales, dtype: int64
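
One way to check an answer to this question is to redo the grouping by hand. The sketch below assumes pandas is installed; `manual_totals` and `grouped_totals` are illustrative names that do not appear in the challenge. It accumulates 'Sales' per 'Category' in a plain dict and compares that with what `groupby(...).sum()` returns.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'], 'Sales': [100, 200, 150, 300, 250]}
df = pd.DataFrame(data)

# Manual split-apply-combine: add each Sales value to its Category bucket.
manual_totals = {}
for category, sales in zip(data['Category'], data['Sales']):
    manual_totals[category] = manual_totals.get(category, 0) + sales

# pandas does the same grouping and returns a Series indexed by Category.
grouped_totals = df.groupby('Category')['Sales'].sum()

print(manual_totals)
print(grouped_totals.to_dict())
```

Both print statements should show the same mapping, since the Series index holds the group labels and its values hold the per-group sums.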

❓ Data output

intermediate

Output of multiple column grouping with mean aggregation

What is the output of grouping the DataFrame by 'Category' and 'Region' columns and calculating the mean of 'Sales'?

import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'], 'Region': ['East', 'West', 'East', 'West', 'East', 'West'], 'Sales': [100, 150, 200, 300, 250, 350]}
df = pd.DataFrame(data)
result = df.groupby(['Category', 'Region'])['Sales'].mean()
print(result)

A. Category Region\nA East 100\n West 150\nB East 200\n West 300\nC East 250\n West 350\nName: Sales, dtype: int64

B. Category Region\nA East 125.0\n West 225.0\nB East 250.0\n West 325.0\nC East 300.0\n West 400.0\nName: Sales, dtype: float64

C. Category Region\nA East 100.0\n West 150.0\nB East 200.0\n West 300.0\nC East 250.0\n West 350.0\nName: Sales, dtype: float64

D. Category Region\nA East 150.0\n West 100.0\nB East 300.0\n West 200.0\nC East 350.0\n West 250.0\nName: Sales, dtype: float64
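
Two properties of the result help when comparing the options: how many rows fall into each (Category, Region) group, and which dtype `mean()` produces. The sketch below assumes pandas is installed; `group_sizes` and `group_means` are illustrative names, not part of the challenge.

```python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350]}
df = pd.DataFrame(data)

# Count the rows in each (Category, Region) group.
group_sizes = df.groupby(['Category', 'Region']).size()

# Mean of Sales per group; the result is indexed by a two-level MultiIndex.
group_means = df.groupby(['Category', 'Region'])['Sales'].mean()

print(group_sizes.max())
print(group_means.dtype)
print(group_means.index.nlevels)
```

When every group holds a single row, the mean of a group is that row's own value, and averaging integer input yields a floating-point Series.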

🧠 Conceptual

advanced

Understanding groupby object behavior

What happens when you index the result of a single-column groupby aggregation with the name of a column that was not part of the aggregation?

import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Sales': [100, 200, 150], 'Profit': [30, 50, 40]}
df = pd.DataFrame(data)
grouped = df.groupby('Category')['Sales'].sum()
print(grouped['Profit'])

A. Returns the sum of 'Profit' for each category.

B. Raises a KeyError because 'Profit' is not part of the grouped object.

C. Returns the original 'Profit' column without grouping.

D. Returns NaN values for 'Profit' since it was not aggregated.
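
To see what the aggregated object contains, it helps to inspect its index and guard the lookup. This is a minimal sketch, assuming pandas is installed; `sales_totals`, `lookup_failed` and `both_totals` are illustrative names, not part of the challenge. The last step shows one possible way to keep 'Profit' available, by selecting both columns before aggregating.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Sales': [100, 200, 150], 'Profit': [30, 50, 40]}
df = pd.DataFrame(data)

# Aggregating one selected column yields a Series labelled by Category.
sales_totals = df.groupby('Category')['Sales'].sum()
print(list(sales_totals.index))

# Indexing a Series with a string looks that string up among the index labels.
try:
    sales_totals['Profit']
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)

# Selecting both columns before aggregating returns a DataFrame with both.
both_totals = df.groupby('Category')[['Sales', 'Profit']].sum()
print(both_totals['Profit'].to_dict())
```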

🔧 Debug

advanced

Identify the error in multiple column grouping code

What error will this code raise when grouping by multiple columns and aggregating?

import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Region': ['East', 'West', 'East'], 'Sales': [100, 200, 150]}
df = pd.DataFrame(data)
result = df.groupby('Category', 'Region')['Sales'].sum()
print(result)

A. KeyError: 'Region' not found

B. No error, outputs grouped sums correctly

C. SyntaxError: invalid syntax

D. ValueError, because 'Region' is bound to groupby()'s second positional parameter instead of being used as a grouping key
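
The difference between passing the keys as one list and passing them as separate positional arguments can be checked directly. The sketch below assumes pandas is installed; `raised` is an illustrative name. Because the exact exception type and message may differ between pandas versions, the sketch catches any exception and only records its type name.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Region': ['East', 'West', 'East'], 'Sales': [100, 200, 150]}
df = pd.DataFrame(data)

# Grouping keys passed together as a single list.
result = df.groupby(['Category', 'Region'])['Sales'].sum()
print(result.to_dict())

# Keys passed as separate positional arguments: only 'Category' is the key,
# and 'Region' is bound to whichever parameter comes second in groupby().
try:
    df.groupby('Category', 'Region')['Sales'].sum()
    raised = None
except Exception as exc:  # broad on purpose: the type is version dependent
    raised = type(exc).__name__
print(raised)
```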

🚀 Application

expert

Calculate total sales and average profit by category and region

Given the DataFrame below, which code snippet correctly groups by 'Category' and 'Region' to calculate total 'Sales' and average 'Profit'?

import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'], 'Region': ['East', 'West', 'East', 'West', 'East', 'West'], 'Sales': [100, 150, 200, 300, 250, 350], 'Profit': [20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)

A.
result = df.groupby(['Category', 'Region']).agg({'Sales': 'sum', 'Profit': 'mean'})
print(result)

B.
result = df.groupby('Category', 'Region').agg({'Sales': 'sum', 'Profit': 'mean'})
print(result)

C.
result = df.groupby(['Category', 'Region'])[['Sales', 'Profit']].agg(['sum', 'mean'])
print(result)

D.
result = df.groupby(['Category', 'Region']).agg({'Sales': 'mean', 'Profit': 'sum'})
print(result)
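
A dict passed to `agg` and a list passed to `agg` produce results of different shape, which is easy to overlook when reading the snippets. The sketch below assumes pandas is installed; `keys`, `per_column` and `every_pair` are illustrative names, not part of the challenge. It compares the column layout of the two forms.

```python
import pandas as pd

data = {'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Region': ['East', 'West', 'East', 'West', 'East', 'West'],
        'Sales': [100, 150, 200, 300, 250, 350],
        'Profit': [20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)
keys = ['Category', 'Region']

# Dict form: one aggregation per named column.
per_column = df.groupby(keys).agg({'Sales': 'sum', 'Profit': 'mean'})

# List form: every listed aggregation applied to every selected column.
every_pair = df.groupby(keys)[['Sales', 'Profit']].agg(['sum', 'mean'])

print(list(per_column.columns))
print(list(every_pair.columns))
```

The dict form yields one column per entry, while the list form yields a column for each (column, function) pair under a two-level column index.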
