Challenge - 5 Problems
Pivot Table Master
❓ Predict Output
intermediate
Output of a simple pivot_table()
What is the output of this code snippet using pandas pivot_table()?
Pandas
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2020],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)
result = pd.pivot_table(df, values='Sales', index='City', columns='Year',
                        aggfunc='sum', fill_value=0)
print(result)
💡 Hint
Think about how pivot_table sums Sales grouped by City and Year.
The pivot_table sums Sales for each City and Year. LA has 200 in 2020 and 250 in 2021. NY has 100 + 300 = 400 in 2020 and 150 in 2021.
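The arithmetic in the explanation can be checked cell by cell. The sketch below reruns the question's snippet and reads each value with `.loc`; it uses only the data from the question, and the variable names such as `ny_2020` are introduced here purely for illustration.

```python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2020],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

result = pd.pivot_table(df, values='Sales', index='City', columns='Year',
                        aggfunc='sum', fill_value=0)

# Each cell holds the sum of Sales for one (City, Year) pair.
ny_2020 = result.loc['NY', 2020]   # two NY rows in 2020: 100 + 300
ny_2021 = result.loc['NY', 2021]   # single row: 150
la_2020 = result.loc['LA', 2020]   # single row: 200
la_2021 = result.loc['LA', 2021]   # single row: 250
print(ny_2020, ny_2021, la_2020, la_2021)
```

Note that the row labels come out sorted, so LA appears before NY in the printed table even though NY comes first in the input.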
❓ Data Output
intermediate
Number of items in pivot_table result
Given this pivot_table code, how many total cells (values) are in the resulting DataFrame?
Pandas
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Region': ['East', 'East', 'West', 'West', 'East'],
        'Quantity': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)
pivot = pd.pivot_table(df, values='Quantity', index='Product', columns='Region',
                       aggfunc='sum', fill_value=0)
print(pivot)
💡 Hint
Count unique Products and Regions after pivot.
There are 3 unique Products (A, B, C) and 2 unique Regions (East, West). The pivot table has 3 rows and 2 columns, so 3*2=6 cells.
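Rather than counting by hand, the cell count can be read off the result itself. This is a minimal sketch using the question's data; `shape` and `size` are standard DataFrame attributes, while the names `n_rows`, `n_cols`, and `n_cells` are chosen here for illustration.

```python
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Region': ['East', 'East', 'West', 'West', 'East'],
        'Quantity': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, values='Quantity', index='Product', columns='Region',
                       aggfunc='sum', fill_value=0)

n_rows, n_cols = pivot.shape   # unique Products x unique Regions
n_cells = pivot.size           # total number of cells: rows * columns
print(n_rows, n_cols, n_cells)
```

Product C has no West row in the input, but the cell still exists in the grid and is filled with 0 because of `fill_value=0`.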
🔧 Debug
advanced
Identify the error in pivot_table aggregation
What error will this code raise when run?
Pandas
import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [5, 10, 15]}
df = pd.DataFrame(data)
pivot = pd.pivot_table(df, values='Value', index='Category', aggfunc='mean',
                       columns='NonExistent')
print(pivot)
💡 Hint
Check if the 'columns' argument refers to a valid DataFrame column.
The 'columns' parameter is set to 'NonExistent', which is not a column in df. This causes a KeyError.
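The failure can be reproduced safely by wrapping the call in `try`/`except`. The sketch below catches the `KeyError` and then shows one possible fix, which is simply to drop the invalid `columns` argument. That fix is an assumption made for illustration; the question itself only asks which error is raised.

```python
import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [5, 10, 15]}
df = pd.DataFrame(data)

try:
    pd.pivot_table(df, values='Value', index='Category',
                   columns='NonExistent', aggfunc='mean')
    error_name = None
except KeyError as exc:
    # 'NonExistent' is not a column of df, so the grouping step fails.
    error_name = type(exc).__name__
    print(f"{error_name}: {exc}")

# One possible fix (an assumption): group by Category only.
fixed = pd.pivot_table(df, values='Value', index='Category', aggfunc='mean')
print(fixed)
```

With the invalid argument removed, the result has one row per Category and a single `Value` column holding the mean.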
🚀 Application
advanced
Using pivot_table() to find average scores by group
You have a DataFrame with student scores. Which pivot_table call correctly computes the average score per subject and gender?
Pandas
import pandas as pd

data = {'Student': ['Ann', 'Bob', 'Cathy', 'Dan'],
        'Gender': ['F', 'M', 'F', 'M'],
        'Subject': ['Math', 'Math', 'English', 'English'],
        'Score': [90, 80, 85, 75]}
df = pd.DataFrame(data)
💡 Hint
Average score per subject and gender means subject as rows, gender as columns, mean aggregation.
Option A groups by Subject (rows) and Gender (columns) and calculates mean Score, which matches the requirement.
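The answer options themselves are not reproduced here, so the call below is a reconstruction from the explanation and not the quiz's exact wording: Subject as the index, Gender as the columns, and `mean` as the aggregation. The name `avg_scores` is hypothetical.

```python
import pandas as pd

data = {'Student': ['Ann', 'Bob', 'Cathy', 'Dan'],
        'Gender': ['F', 'M', 'F', 'M'],
        'Subject': ['Math', 'Math', 'English', 'English'],
        'Score': [90, 80, 85, 75]}
df = pd.DataFrame(data)

# Assumed form of the correct call: rows = Subject, columns = Gender,
# each cell = mean Score for that (Subject, Gender) pair.
avg_scores = pd.pivot_table(df, values='Score', index='Subject',
                            columns='Gender', aggfunc='mean')
print(avg_scores)
```

Each (Subject, Gender) pair has exactly one student in this data, so every mean equals that student's score.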
🧠 Conceptual
expert
Understanding fill_value in pivot_table()
What is the effect of the fill_value parameter in pandas pivot_table()?
💡 Hint
Think about what happens when some groups have no data for certain columns.
fill_value replaces NaN values in the pivot table result with the given value, making the table easier to read and use.
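The effect is easiest to see by building the same pivot table twice, once without `fill_value` and once with it. The sketch reuses the Product/Region data from the earlier question, where Product C has no West row; the names `without_fill` and `with_fill` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'B', 'A', 'B', 'C'],
                   'Region': ['East', 'East', 'West', 'West', 'East'],
                   'Quantity': [10, 20, 15, 25, 30]})

# No fill_value: the missing (C, West) combination shows up as NaN.
without_fill = pd.pivot_table(df, values='Quantity', index='Product',
                              columns='Region', aggfunc='sum')

# fill_value=0: the same cell is replaced by 0 after aggregation.
with_fill = pd.pivot_table(df, values='Quantity', index='Product',
                           columns='Region', aggfunc='sum', fill_value=0)

print(without_fill)
print(with_fill)
```

Only cells with no underlying data change; combinations that do have rows keep their aggregated values.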