Challenge - 5 Problems
Aggregation Master
❓ Predict Output
intermediate
Output of sum aggregation on DataFrame column
What is the output of the following Python code using pandas?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'values': [10, 20, 30, 40]})
result = df['values'].sum()
print(result)
💡 Hint
Sum adds all numbers in the column.
The sum of 10 + 20 + 30 + 40 is 100.
❓ data_output
intermediate
Count of non-null entries in DataFrame column
Given this DataFrame, what is the output of the count aggregation?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'scores': [5, None, 7, None, 9]})
result = df['scores'].count()
print(result)
💡 Hint
Count counts only non-null values.
There are 3 non-null values: 5, 7, and 9.
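To see why the answer is 3 and not 5, it can help to compare count() with the total number of rows. The sketch below reuses the question's data; the variable names are only illustrative.

```python
import pandas as pd

# Same data as the question: two of the five entries are missing.
df = pd.DataFrame({'scores': [5, None, 7, None, 9]})

non_null = df['scores'].count()      # counts only the non-null entries
total_rows = len(df['scores'])       # counts every row, missing or not
missing = df['scores'].isna().sum()  # counts only the missing entries

print(non_null, total_rows, missing)  # 3 5 2
```

The three numbers always satisfy non_null + missing == total_rows.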
❓ Predict Output
advanced
Mean aggregation with missing values
What is the output of this code calculating the mean?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'temps': [20, 25, None, 30]})
result = df['temps'].mean()
print(round(result, 2))
💡 Hint
Mean ignores missing values by default.
The mean is (20 + 25 + 30) / 3 = 25.0.
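The default behavior comes from the skipna parameter of mean(), which is True unless you change it. As a minimal sketch using the question's data (variable names are illustrative), passing skipna=False lets the missing value propagate instead:

```python
import math
import pandas as pd

df = pd.DataFrame({'temps': [20, 25, None, 30]})

default_mean = df['temps'].mean()             # skipna=True: (20 + 25 + 30) / 3
strict_mean = df['temps'].mean(skipna=False)  # the missing value makes the result NaN

print(default_mean)             # 25.0
print(math.isnan(strict_mean))  # True
```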
❓ visualization
advanced
Bar chart of sum aggregation by group
Which option shows the correct bar chart output for sum of sales by region?
Data Analysis Python
import pandas as pd
import matplotlib.pyplot as plt

data = {'region': ['East', 'West', 'East', 'West'], 'sales': [100, 200, 150, 300]}
df = pd.DataFrame(data)
sum_sales = df.groupby('region')['sales'].sum()
sum_sales.plot(kind='bar')
plt.show()
💡 Hint
Group sums add sales per region.
East sales: 100 + 150 = 250; West sales: 200 + 300 = 500.
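One way to check the expected bar heights without drawing the chart is to inspect the grouped sums directly. This sketch reuses the question's data and skips matplotlib; the bar_heights name is only illustrative.

```python
import pandas as pd

data = {'region': ['East', 'West', 'East', 'West'], 'sales': [100, 200, 150, 300]}
df = pd.DataFrame(data)

# groupby sorts the group keys by default, so East comes before West.
sum_sales = df.groupby('region')['sales'].sum()

# Each index label becomes a bar; each value becomes that bar's height.
bar_heights = {region: int(total) for region, total in sum_sales.items()}
print(bar_heights)  # {'East': 250, 'West': 500}
```

The correct chart therefore has two bars, with West twice as tall as East.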
🧠 Conceptual
expert
Understanding count vs size in groupby
In pandas, what is the difference between using count() and size() after a groupby operation?
💡 Hint
Think about how missing data affects counts.
count() counts only the non-null entries in each group, while size() counts every row in each group, including rows with nulls.
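A small example makes the difference visible. The data below is hypothetical (it is not part of the challenge): group 'a' has two rows, one of which is missing a value.

```python
import pandas as pd

# Hypothetical data: group 'a' has one missing value, group 'b' has none.
df = pd.DataFrame({
    'group': ['a', 'a', 'b'],
    'value': [1.0, None, 3.0],
})

grouped = df.groupby('group')['value']
counts = grouped.count()  # non-null values per group
sizes = grouped.size()    # rows per group, nulls included

print(int(counts['a']), int(sizes['a']))  # 1 2
print(int(counts['b']), int(sizes['b']))  # 1 1
```

The two results differ only for groups that contain missing values, which is why they agree for 'b' but not for 'a'.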