Challenge - 5 Problems

❓ Predict Output

intermediate

Output of a simple pivot_table()

What is the output of this code snippet using pandas pivot_table()?

Pandas

import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2020],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

result = pd.pivot_table(df, values='Sales', index='City', columns='Year', aggfunc='sum', fill_value=0)
print(result)

A.
Year  2020  2021
City
LA     200   250
NY     300   150

B.
Year  2020  2021
City
LA     100   250
NY     400   150

C.
Year  2020  2021
City
LA     200   150
NY     300   250

D.
Year  2020  2021
City
LA     200   250
NY     400   150
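
One way to check a prediction like this by hand is to rebuild the table with groupby and unstack, which for a single aggregation function arrives at the same layout as pivot_table(). The sketch below assumes the same data as the snippet above; the names `manual` and `pivoted` are introduced here only for illustration.

```python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2020],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

# Sum Sales for every (City, Year) pair, then spread Year across the columns.
manual = df.groupby(['City', 'Year'])['Sales'].sum().unstack(fill_value=0)

# The same aggregation expressed as a pivot table.
pivoted = pd.pivot_table(df, values='Sales', index='City', columns='Year',
                         aggfunc='sum', fill_value=0)

# Both tables should hold the same total for each City and Year.
print(manual)
print(pivoted)
```

Rows that share a City and Year are combined by the aggregation function, so it helps to look for repeated pairs in the input before predicting the output.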

❓ Data Output

intermediate

Number of items in pivot_table result

Given this pivot_table code, how many total cells (values) are in the resulting DataFrame?

Pandas

import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Region': ['East', 'East', 'West', 'West', 'East'],
        'Quantity': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, values='Quantity', index='Product', columns='Region', aggfunc='sum', fill_value=0)
print(pivot)
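
Once the table exists, its cell count can be read from the `shape` and `size` attributes rather than counted by eye. A minimal sketch, assuming the same `data` dictionary as the snippet above; the variable names `n_rows`, `n_cols`, and `n_cells` are illustrative.

```python
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Region': ['East', 'East', 'West', 'West', 'East'],
        'Quantity': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, values='Quantity', index='Product',
                       columns='Region', aggfunc='sum', fill_value=0)

# shape is (number of index values, number of column values).
n_rows, n_cols = pivot.shape

# size is rows * columns; it also counts cells filled in by fill_value.
n_cells = pivot.size

print(n_rows, n_cols, n_cells)
```

The row count comes from the distinct values of `Product` and the column count from the distinct values of `Region`, whether or not every combination appears in the input.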

🔧 Debug

advanced

Identify the error in pivot_table aggregation

What error will this code raise when run?

Pandas

import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [5, 10, 15]}
df = pd.DataFrame(data)

pivot = pd.pivot_table(df, values='Value', index='Category', aggfunc='mean', columns='NonExistent')
print(pivot)

A. ValueError: No numeric types to aggregate

B. KeyError: 'NonExistent'

C. TypeError: aggfunc must be callable or string

D. No error, prints pivot table with NaN values
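
To confirm which exception, if any, a call like this raises, it can be wrapped in a try/except that records the exception type. This is only a sketch for checking an answer; the variable `error_name` is a name chosen here, not part of pandas.

```python
import pandas as pd

data = {'Category': ['X', 'Y', 'X'], 'Value': [5, 10, 15]}
df = pd.DataFrame(data)

try:
    pivot = pd.pivot_table(df, values='Value', index='Category',
                           aggfunc='mean', columns='NonExistent')
except Exception as exc:
    # Record the exception class so it can be compared with the options.
    error_name = type(exc).__name__
else:
    # No exception was raised.
    error_name = None

print(error_name)
```

A defensive alternative is to test `'NonExistent' in df.columns` before calling pivot_table().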

🚀 Application

advanced

Using pivot_table() to find average scores by group

You have a DataFrame with student scores. Which pivot_table call correctly computes the average score per subject and gender?

Pandas

import pandas as pd

data = {'Student': ['Ann', 'Bob', 'Cathy', 'Dan'],
        'Gender': ['F', 'M', 'F', 'M'],
        'Subject': ['Math', 'Math', 'English', 'English'],
        'Score': [90, 80, 85, 75]}
df = pd.DataFrame(data)

A. pd.pivot_table(df, values='Score', index='Subject', columns='Gender', aggfunc='mean')

B. pd.pivot_table(df, values='Score', index='Gender', columns='Subject', aggfunc='sum')

C. pd.pivot_table(df, values='Score', index='Student', columns='Subject', aggfunc='mean')

D. pd.pivot_table(df, values='Score', index='Subject', columns='Gender', aggfunc='sum')
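
In the sample data each Subject/Gender pair contains exactly one student, so 'sum' and 'mean' happen to return the same numbers there. The sketch below adds one hypothetical row (a student named Eve, who is not in the original data) so that the two aggregation functions give different results.

```python
import pandas as pd

data = {'Student': ['Ann', 'Bob', 'Cathy', 'Dan', 'Eve'],  # Eve is hypothetical
        'Gender': ['F', 'M', 'F', 'M', 'F'],
        'Subject': ['Math', 'Math', 'English', 'English', 'Math'],
        'Score': [90, 80, 85, 75, 70]}
df = pd.DataFrame(data)

# Average score for each Subject/Gender pair.
by_mean = pd.pivot_table(df, values='Score', index='Subject',
                         columns='Gender', aggfunc='mean')

# Total score for each Subject/Gender pair.
by_sum = pd.pivot_table(df, values='Score', index='Subject',
                        columns='Gender', aggfunc='sum')

# The Math/F cell now combines two students, so the tables differ there.
print(by_mean)
print(by_sum)
```

When judging the options, the choice of `aggfunc` therefore matters even if a small dataset makes two calls look interchangeable.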

🧠 Conceptual

expert

Understanding fill_value in pivot_table()

What is the effect of the fill_value parameter in pandas pivot_table()?

A. It changes the aggregation function to fill missing values with fill_value.

B. It filters out rows with missing values before aggregation.

C. It replaces missing values in the pivot table with the specified fill_value.

D. It renames the columns with the fill_value string.
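
The behaviour of fill_value is easiest to see by building the same table with and without it. The sketch below uses a small Product/Region dataset in which product C has no West row; the names `without_fill` and `with_fill` are chosen here for illustration.

```python
import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Region': ['East', 'East', 'West', 'West', 'East'],
        'Quantity': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)

# No fill_value: the missing C/West combination shows up as NaN.
without_fill = pd.pivot_table(df, values='Quantity', index='Product',
                              columns='Region', aggfunc='sum')

# fill_value=0: the same cell holds 0 in the resulting table.
with_fill = pd.pivot_table(df, values='Quantity', index='Product',
                           columns='Region', aggfunc='sum', fill_value=0)

print(without_fill)
print(with_fill)
```

Cells that do have data are the same in both tables; only the cell with no matching input rows changes.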
