Challenge - 5 Problems

Data Aggregation Master

❓ Predict Output

intermediate

Output of groupby aggregation with multiple functions

What is the output of this code snippet that groups data by 'Category' and applies multiple aggregation functions?

Pandas

import pandas as pd

data = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C'],
    'Value': [10, 20, 10, 30, 40]
})

result = data.groupby('Category').agg({'Value': ['sum', 'mean']})
print(result)

A.
          Value     
           sum  mean
Category            
A           30  15.0
B           40  20.0
C           40  40.0

B.
          Value     
           mean  sum
Category            
A          15.0   30
B          20.0   40
C          40.0   40

C.
          Value     
           sum  mean
Category            
A           30  20.0
B           40  10.0
C           40  40.0

D.
          Value     
           sum  mean
Category            
A           20  15.0
B           40  20.0
C           40  40.0
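
As a side note on reading this kind of output, here is a minimal sketch on separate, made-up data (the `Team` and `Points` names are illustrative assumptions, not part of the question). It shows that passing a list of functions to `agg` yields two-level column labels, which can then be selected with a tuple key.

```python
import pandas as pd

# Hypothetical data, unrelated to the question above
df = pd.DataFrame({
    'Team': ['red', 'red', 'blue'],
    'Points': [1, 3, 5],
})

# A list of functions per column produces MultiIndex columns
summary = df.groupby('Team').agg({'Points': ['min', 'max']})

# Each column label is a (column name, function name) pair
column_labels = summary.columns.tolist()
print(column_labels)

# Select a single aggregated column with a tuple key
max_points = summary[('Points', 'max')]
print(max_points)
```

If flat column names are preferred, named aggregation such as `agg(total=('Points', 'sum'))` is one alternative worth considering.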

❓ Data Output

intermediate

Number of groups after filtering and aggregation

Given this DataFrame, how many groups remain after grouping by 'Type' and keeping only the groups whose total 'Score' is greater than 50?

Pandas

import pandas as pd

data = pd.DataFrame({
    'Type': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'Score': [30, 25, 10, 15, 40, 20]
})

filtered = data.groupby('Type').filter(lambda x: x['Score'].sum() > 50)
groups = filtered.groupby('Type').ngroups
print(groups)
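
To make the behaviour of `filter` easier to reason about, the following is a small sketch on different, made-up data (`Shop`, `Units`, and the threshold of 10 are assumptions chosen for illustration). It prints the per-group totals that the predicate sees, then the rows that survive.

```python
import pandas as pd

# Hypothetical data, unrelated to the question above
df = pd.DataFrame({
    'Shop': ['a', 'a', 'b', 'c', 'c'],
    'Units': [5, 6, 2, 1, 1],
})

# Per-group totals: the quantity the predicate below tests
totals = df.groupby('Shop')['Units'].sum()
print(totals)

# filter() keeps the original rows of every group whose predicate is True
kept = df.groupby('Shop').filter(lambda g: g['Units'].sum() > 10)
print(kept)

# Count the distinct groups left after filtering
remaining_groups = kept['Shop'].nunique()
print(remaining_groups)
```

Note that `filter` returns rows, not aggregated values, so the result has the same columns as the input DataFrame.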

🔧 Debug

advanced

Identify the error in aggregation code

What error does this code raise when trying to aggregate data?

Pandas

import pandas as pd

data = pd.DataFrame({
    'Group': ['A', 'A', 'B'],
    'Value': [1, 2, 3]
})

result = data.groupby('Group').agg({'Value': 'sum', 'NonExistent': 'mean'})
print(result)

A. AttributeError: 'DataFrameGroupBy' object has no attribute 'agg'

B. KeyError: 'NonExistent'

C. ValueError: No numeric types to aggregate

D. TypeError: unsupported operand type(s) for +: 'int' and 'str'
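
One defensive pattern for aggregation dictionaries is to check the requested columns against the DataFrame before calling `agg`. The sketch below is only an illustration under that assumption: the `spec`, `valid_spec`, and `skipped` names and the `Missing` column are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Group': ['A', 'A', 'B'],
    'Value': [1, 2, 3],
})

# Hypothetical aggregation spec; 'Missing' is deliberately not a column of df
spec = {'Value': 'sum', 'Missing': 'mean'}

# Keep only the entries whose column actually exists in the DataFrame
valid_spec = {col: func for col, func in spec.items() if col in df.columns}
skipped = sorted(set(spec) - set(valid_spec))

result = df.groupby('Group').agg(valid_spec)
print(result)
print('skipped:', skipped)
```

Whether silently skipping a column is acceptable depends on the use case; raising a custom error that lists `skipped` may be the safer choice.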

❓ Visualization

advanced

Correct plot for aggregated data

Which option shows the correct bar plot code to visualize the sum of 'Sales' per 'Region' from this DataFrame?

Pandas

import pandas as pd
import matplotlib.pyplot as plt

data = pd.DataFrame({
    'Region': ['North', 'South', 'East', 'West', 'North', 'South'],
    'Sales': [100, 150, 200, 130, 120, 170]
})

agg_data = data.groupby('Region')['Sales'].sum()

A.
plt.bar(agg_data, agg_data.index)
plt.show()

B.
agg_data.plot(kind='bar')
plt.show()

C.
plt.bar(agg_data.index, agg_data)
plt.show()

D.
data.plot.bar(x='Region', y='Sales')
plt.show()
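
For reference on how matplotlib's bar call treats its arguments, here is a minimal sketch on made-up data (the `totals` Series and its labels are assumptions for illustration). It uses the non-interactive `Agg` backend so it can run without a display, and reads the bar heights back from the returned container.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical, already-aggregated data: one value per category
totals = pd.Series([3, 7], index=['cats', 'dogs'], name='Count')

fig, ax = plt.subplots()

# First argument: x positions (here category labels); second: bar heights
bars = ax.bar(totals.index, totals.values)
ax.set_xlabel('Animal')
ax.set_ylabel('Count')

# Read the heights back to confirm what was drawn
heights = [bar.get_height() for bar in bars]
print(heights)

plt.close(fig)
```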

🚀 Application

expert

Calculate weighted average after aggregation

Given this DataFrame, which code correctly calculates the weighted average 'Score' per 'Class' using 'Weight' as weights?

Pandas

import pandas as pd

data = pd.DataFrame({
    'Class': ['X', 'X', 'Y', 'Y', 'Y'],
    'Score': [80, 90, 70, 60, 75],
    'Weight': [1, 3, 2, 1, 2]
})

A. data.groupby('Class').apply(lambda x: (x['Score'] * x['Weight']).sum() / x['Weight'].sum())

B. data.groupby('Class').agg({'Score': lambda x: (x * data['Weight']).mean()})

C. data.groupby('Class').agg({'Score': 'mean'})

D. data.groupby('Class').apply(lambda x: x['Score'].mean() * x['Weight'].mean())
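
A weighted average is the sum of value times weight divided by the sum of weights. The sketch below illustrates that definition on separate, made-up data (`Lab`, `Reading`, `Count`, and the `weighted_reading` helper column are assumptions) and cross-checks one group against `numpy.average`.

```python
import numpy as np
import pandas as pd

# Hypothetical data, unrelated to the question above
df = pd.DataFrame({
    'Lab': ['p', 'p', 'q'],
    'Reading': [10.0, 20.0, 5.0],
    'Count': [1, 3, 2],
})

# Multiply value by weight row-wise, then sum numerator and denominator per group
sums = (
    df.assign(weighted_reading=df['Reading'] * df['Count'])
      .groupby('Lab')[['weighted_reading', 'Count']]
      .sum()
)
weighted_avg = sums['weighted_reading'] / sums['Count']
print(weighted_avg)

# Cross-check group 'p' with numpy's built-in weighted average
p_rows = df[df['Lab'] == 'p']
p_check = np.average(p_rows['Reading'], weights=p_rows['Count'])
print(p_check)
```

This variant avoids `groupby(...).apply` altogether, which can be convenient on large frames because the multiplication and sums stay vectorized.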
