Data Analysis Python, ~20 mins

Aggregation-based features in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of group aggregation with multiple functions
Which option matches the result of the following snippet, which groups the data by 'Category' and computes the mean and max of 'Value'? The options show the resulting rows as a list of records.
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C'],
    'Value': [10, 20, 10, 30, 40]
})

result = data.groupby('Category')['Value'].agg(['mean', 'max']).reset_index()
print(result)
A. [{'Category': 'A', 'mean': 15.0, 'max': 20}, {'Category': 'B', 'mean': 20.0, 'max': 30}, {'Category': 'C', 'mean': 40.0, 'max': 40}]
B. [{'Category': 'A', 'mean': 10.0, 'max': 20}, {'Category': 'B', 'mean': 20.0, 'max': 30}, {'Category': 'C', 'mean': 40.0, 'max': 40}]
C. [{'Category': 'A', 'mean': 15.0, 'max': 10}, {'Category': 'B', 'mean': 20.0, 'max': 30}, {'Category': 'C', 'mean': 40.0, 'max': 40}]
D. [{'Category': 'A', 'mean': 15.0, 'max': 20}, {'Category': 'B', 'mean': 20.0, 'max': 10}, {'Category': 'C', 'mean': 40.0, 'max': 40}]
💡 Hint
Remember that mean is the average and max is the highest value in each group.
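To make the hint concrete without giving the answer away, here is a minimal sketch on made-up data (the `Team` and `Score` columns are hypothetical, not part of the problem). It shows that passing a list of function names to `agg()` yields one column per function, and that `to_dict('records')` gives the list-of-records form used in the answer options.

```python
import pandas as pd

# Hypothetical data, deliberately different from the quiz snippet.
df = pd.DataFrame({
    'Team': ['red', 'red', 'blue'],
    'Score': [4, 6, 9],
})

# One output column per function name in the list.
summary = df.groupby('Team')['Score'].agg(['mean', 'max']).reset_index()
print(summary)

# The same rows as a list of dicts, the form the answer options use.
records = summary.to_dict('records')
print(records)
```

Group keys come back sorted by default, so 'blue' is listed before 'red' here.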
data_output
intermediate
Number of unique users per product
Given a DataFrame of user purchases, how many unique users bought each product?
import pandas as pd

data = pd.DataFrame({
    'User': ['Alice', 'Bob', 'Alice', 'David', 'Bob', 'Eve'],
    'Product': ['X', 'X', 'Y', 'Y', 'X', 'Z']
})

result = data.groupby('Product')['User'].nunique().reset_index(name='UniqueUsers')
print(result)
A. [{'Product': 'X', 'UniqueUsers': 2}, {'Product': 'Y', 'UniqueUsers': 3}, {'Product': 'Z', 'UniqueUsers': 1}]
B. [{'Product': 'X', 'UniqueUsers': 3}, {'Product': 'Y', 'UniqueUsers': 2}, {'Product': 'Z', 'UniqueUsers': 1}]
C. [{'Product': 'X', 'UniqueUsers': 2}, {'Product': 'Y', 'UniqueUsers': 2}, {'Product': 'Z', 'UniqueUsers': 1}]
D. [{'Product': 'X', 'UniqueUsers': 3}, {'Product': 'Y', 'UniqueUsers': 3}, {'Product': 'Z', 'UniqueUsers': 1}]
💡 Hint
Count distinct users per product.
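The difference between counting rows and counting distinct values is easy to miss when a user appears more than once in a group. A small sketch on made-up data (`Reader` and `Book` are hypothetical names) contrasts `count()` with `nunique()`:

```python
import pandas as pd

# Hypothetical data: Ann borrowed book P twice, Raj borrowed Q twice.
df = pd.DataFrame({
    'Reader': ['Ann', 'Ann', 'Raj', 'Raj', 'Raj'],
    'Book': ['P', 'P', 'P', 'Q', 'Q'],
})

by_book = df.groupby('Book')['Reader']

# count() counts rows, so repeat borrowers are counted every time.
rows = by_book.count()

# nunique() counts each distinct reader once per book.
distinct = by_book.nunique().reset_index(name='UniqueReaders')

print(rows)
print(distinct)
```

Repeated rows inflate `count()` but leave `nunique()` unchanged, which is why the latter is the usual choice for "how many different users" features.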
visualization
advanced
Visualizing average sales per region
Which plot correctly shows the average sales per region from the given data?
import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({
    'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
    'Sales': [100, 150, 200, 130, 120, 170]
})
avg_sales = data.groupby('Region')['Sales'].mean().reset_index()

plt.bar(avg_sales['Region'], avg_sales['Sales'])
plt.title('Average Sales per Region')
plt.xlabel('Region')
plt.ylabel('Average Sales')
plt.show()
A. Bar chart with regions on the x-axis and average sales on the y-axis; bar heights: North=110, South=160, East=200, West=130
B. Line chart with regions on the x-axis and average sales on the y-axis; points: North=110, South=160, East=200, West=130
C. Pie chart showing percentage of sales per region, with values: North=220, South=320, East=200, West=130
D. Scatter plot with sales values plotted against region names
💡 Hint
Look for a bar chart showing average values per category.
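What a bar chart of group averages displays can be checked without drawing anything: the bar labels and heights are just the two columns of the aggregated frame. The sketch below uses made-up data (`Store` and `Units` are hypothetical) and leaves out matplotlib so that it runs with pandas alone.

```python
import pandas as pd

# Hypothetical data, not the quiz's regions.
df = pd.DataFrame({
    'Store': ['west', 'east', 'west', 'east'],
    'Units': [8, 3, 12, 5],
})

# Mean per group; groupby sorts the group keys by default.
avg_units = df.groupby('Store')['Units'].mean().reset_index()

# These two lists are what a call like plt.bar(labels, heights) would receive.
labels = avg_units['Store'].tolist()
heights = avg_units['Units'].tolist()

print(labels)
print(heights)
```

Because the keys are sorted, the bars follow alphabetical order of the group labels rather than the order in which the labels first appear in the data.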
🔧 Debug
advanced
Identify the error in aggregation code
What error will the following code raise when executed?
import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'B', 'A'],
    'Value': [10, 20, 30]
})

result = data.groupby('Category')['Value'].agg('sum', 'mean')
print(result)
A. No error, outputs sum and mean aggregation
B. TypeError: agg() takes from 1 to 2 positional arguments but 3 were given
C. KeyError: 'mean'
D. ValueError: Function names must be passed as a list or tuple
💡 Hint
Check how multiple aggregation functions are passed to agg().
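For comparison, here is a sketch of two calling conventions that do apply several functions to one column: a single list argument, or keyword-style named aggregation. The data and the output names `total` and `average` are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical data, not the quiz's frame.
df = pd.DataFrame({
    'Team': ['red', 'blue', 'red'],
    'Score': [4, 9, 6],
})

grouped = df.groupby('Team')['Score']

# A single list argument: one output column per function name.
as_list = grouped.agg(['sum', 'mean'])

# Named aggregation: keyword = output column name, value = function name.
named = grouped.agg(total='sum', average='mean')

print(as_list)
print(named)
```

Both forms pass the functions as one logical argument, which is the detail the hint points at.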
🚀 Application
expert
Calculate weighted average feature per group
You have a DataFrame with 'Group', 'Value', and 'Weight' columns. Which code correctly calculates the weighted average of 'Value' per 'Group' using 'Weight' as weights?
import pandas as pd

data = pd.DataFrame({
    'Group': ['X', 'X', 'Y', 'Y', 'Y'],
    'Value': [10, 20, 30, 40, 50],
    'Weight': [1, 3, 2, 1, 1]
})
A. data.groupby('Group').agg({'Value': 'mean'}).reset_index()
B. data.groupby('Group').apply(lambda x: x['Value'].mean() * x['Weight'].mean()).reset_index(name='WeightedAvg')
C. data.groupby('Group').agg({'Value': lambda x: (x * data['Weight']).sum() / data['Weight'].sum()}).reset_index()
D. data.groupby('Group').apply(lambda x: (x['Value'] * x['Weight']).sum() / x['Weight'].sum()).reset_index(name='WeightedAvg')
💡 Hint
Weighted average is sum of value*weight divided by sum of weights per group.
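The formula in the hint can also be written without a per-group function, by summing the products and the weights per group and then dividing. This is only one possible sketch, on made-up data (`Shop`, `Price`, `Qty` and the helper column `PriceQty` are hypothetical), and it can serve as a cross-check for whichever option you pick.

```python
import pandas as pd

# Hypothetical data: price observations weighted by quantity sold.
df = pd.DataFrame({
    'Shop': ['s1', 's1', 's2'],
    'Price': [2.0, 4.0, 10.0],
    'Qty': [1, 3, 5],
})

# Per group: sum(value * weight) and sum(weight).
sums = (
    df.assign(PriceQty=df['Price'] * df['Qty'])
      .groupby('Shop')[['PriceQty', 'Qty']]
      .sum()
)

# Weighted average = sum(value * weight) / sum(weight), computed per group.
weighted = (sums['PriceQty'] / sums['Qty']).reset_index(name='WeightedAvg')
print(weighted)
```

For shop s1 this gives (2*1 + 4*3) / (1 + 3) = 3.5, which differs from the plain mean of 3.0 because the higher price carries more weight.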