Challenge - 5 Problems
Pivot Table Master
❓ Predict Output
Difficulty: intermediate
Output of a basic pivot_table aggregation
What is the output of this code snippet using pandas pivot_table?
Data Analysis Python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2021],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')
print(pivot)
💡 Hint
Sum sales grouped by City and Year.
Explanation
The pivot_table sums Sales for each City and Year. NY in 2021 has two entries: 150 and 300, summed to 450.
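The sums in the explanation can be checked by reading individual cells from the result. This is a minimal sketch that rebuilds the DataFrame from the question; the `.loc` lookups and the variable names `ny_2021` and `la_2020` are added here for illustration and are not part of the original quiz.

```python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2021],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

# One row per City, one column per Year, each cell the sum of Sales.
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# NY appears twice in 2021 (150 and 300), so that cell holds their sum.
ny_2021 = pivot.loc['NY', 2021]
# LA appears once in 2020, so the cell is just that single value.
la_2020 = pivot.loc['LA', 2020]

print(pivot)
print("NY 2021:", ny_2021)
print("LA 2020:", la_2020)
```

Every City and Year combination occurs at least once in this data, so the result has no missing cells.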
❓ Data Output
Difficulty: intermediate
Number of non-NaN cells in the pivot table result
How many non-NaN cells does the pivot table produced by this code contain?
Data Analysis Python
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Store': ['X', 'X', 'Y', 'Y', 'X'],
        'Quantity': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='Product', columns='Store', values='Quantity', aggfunc='sum')
💡 Hint
Count cells with actual sums, ignoring missing combinations.
Explanation
The pivot table has Products A, B, C (rows) and Stores X, Y (columns). A-X, A-Y, B-X, B-Y, C-X have values (5 non-NaN cells); C-Y is NaN.
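One way to count the filled cells programmatically is to build a boolean mask with `notna()` and sum it. This is a sketch of one possible approach, not the only one; the variable name `non_nan_cells` is an illustrative choice.

```python
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Store': ['X', 'X', 'Y', 'Y', 'X'],
        'Quantity': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='Product', columns='Store', values='Quantity', aggfunc='sum')

# notna() marks every cell that holds a value; summing twice
# (first per column, then across columns) counts those cells.
non_nan_cells = int(pivot.notna().sum().sum())

# Product C was never sold in store Y, so that combination is missing.
c_y_is_missing = bool(pd.isna(pivot.loc['C', 'Y']))

print(pivot)
print("Non-NaN cells:", non_nan_cells)
```

The table has 3 rows and 2 columns, so 6 cells in total, of which one (C-Y) has no underlying data.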
🔧 Debug
Difficulty: advanced
Identify the error in pivot_table usage
What error does this code raise?
Data Analysis Python
import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Value': [1, 2, 3]}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='Category', values='Value', aggfunc='mean', columns='NonExistent')
💡 Hint
Check whether the column name passed to the columns argument exists in the DataFrame.
Explanation
The columns parameter names 'NonExistent', which is not a column of the DataFrame, so pandas raises a KeyError.
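The failure can be reproduced and caught as follows. This is a minimal sketch: the up-front membership check and the names `requested`, `is_missing`, and `raised` are illustrative additions, and the exact wording of the error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'A'], 'Value': [1, 2, 3]})

requested = 'NonExistent'

# A cheap check before pivoting: is the requested column actually present?
is_missing = requested not in df.columns

try:
    df.pivot_table(index='Category', columns=requested, values='Value', aggfunc='mean')
    raised = False
except KeyError as exc:
    # pandas cannot group by a column that does not exist.
    raised = True
    print("KeyError:", exc)
```

Validating column names before calling `pivot_table` gives a clearer failure point than catching the exception afterwards.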
🚀 Application
Difficulty: advanced
Using pivot_table to find average ratings
Given this data, which pivot_table call produces the average rating per user per product?
Data Analysis Python
import pandas as pd

# Note: the 'Rating' values are missing from the source, so the list is left as a gap.
data = {'User': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice'],
        'Product': ['X', 'X', 'Y', 'Y', 'X'],
        'Rating': [...]}
df = pd.DataFrame(data)
pivot = df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='mean')
💡 Hint
Look for average rating per user per product.
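Explanation
Option D groups by User and Product and calculates the mean rating, which matches the requirement. The call in the snippet above (index='User', columns='Product', values='Rating', aggfunc='mean') has this form.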
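To see the averaging in action, here is a small sketch using the same User and Product columns as the question. The Rating values below are hypothetical, chosen only for illustration, since the ratings are not present in the source.

```python
import pandas as pd

data = {'User': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice'],
        'Product': ['X', 'X', 'Y', 'Y', 'X'],
        # Hypothetical ratings (assumption): not taken from the original quiz.
        'Rating': [5, 3, 4, 2, 3]}
df = pd.DataFrame(data)

# Rows are users, columns are products, each cell is the mean rating.
pivot = df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='mean')

# Alice rated product X twice (5 and 3), so her cell is their average.
alice_x = pivot.loc['Alice', 'X']
# Bob rated product Y once, so the mean equals that single rating.
bob_y = pivot.loc['Bob', 'Y']

print(pivot)
```

Since aggfunc='mean' is also the default for `pivot_table`, omitting the argument would give the same result here; spelling it out makes the intent explicit.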
🧠 Conceptual
Difficulty: expert
Understanding fill_value in pivot_table
What is the effect of using fill_value=0 in pivot_table?
💡 Hint
Think about how missing data is handled in the output.
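Explanation
fill_value replaces missing values (NaN) in the resulting pivot table, after aggregation, with the specified value, here 0. Such NaN cells appear for index and column combinations that have no rows in the source data.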
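The effect is easiest to see by pivoting the same data with and without fill_value. This sketch reuses a small Product/Store dataset in which one combination (C-Y) has no rows; the variable names `without_fill` and `with_fill` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'B', 'A', 'B', 'C'],
                   'Store': ['X', 'X', 'Y', 'Y', 'X'],
                   'Quantity': [10, 20, 30, 40, 50]})

# Without fill_value, the missing C-Y combination shows up as NaN.
without_fill = df.pivot_table(index='Product', columns='Store',
                              values='Quantity', aggfunc='sum')

# With fill_value=0, that same cell is reported as 0 instead.
with_fill = df.pivot_table(index='Product', columns='Store',
                           values='Quantity', aggfunc='sum', fill_value=0)

print(without_fill)
print(with_fill)
```

Cells that already hold an aggregated value are unchanged; only the gaps in the result are replaced.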