Challenge - 5 Problems

Pivot Table Master

❓ Predict Output

intermediate

Output of a basic pivot_table aggregation

What is the output of this code snippet using pandas pivot_table?

Data Analysis Python

import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2021],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')
print(pivot)

A.
Year  2020  2021
City            
LA     200   250
NY     100   450

B.
Year  2020  2021
City            
LA     200   250
NY     100   300

C.
Year  2020  2021
City            
LA     200   250
NY     100   150

D.
Year  2020  2021
City            
LA     200   300
NY     100   450
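
One way to check a prediction for this challenge is to compare pivot_table against the equivalent groupby followed by unstack. The sketch below reuses the challenge data; the variable names by_group and ny_2021 are illustrative and not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
    'Year': [2020, 2020, 2021, 2021, 2021],
    'Sales': [100, 200, 150, 250, 300],
})

# pivot_table groups the rows by (City, Year) and applies aggfunc to each group.
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# The same totals computed with groupby + unstack, for comparison.
by_group = df.groupby(['City', 'Year'])['Sales'].sum().unstack('Year')

# NY has two rows for 2021 (150 and 300), so that cell holds their sum.
ny_2021 = pivot.loc['NY', 2021]
print(pivot)
```

Every City/Year pair occurs at least once here, so no cell is missing and the sums stay integers.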

❓ Data Output

intermediate

Number of items in pivot table result

How many non-NaN cells does the pivot table produced by this code contain?

import pandas as pd

data = {'Product': ['A', 'B', 'A', 'B', 'C'],
        'Store': ['X', 'X', 'Y', 'Y', 'X'],
        'Quantity': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

pivot = df.pivot_table(index='Product', columns='Store', values='Quantity', aggfunc='sum')
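
A minimal sketch of how the cells could be counted, assuming the same data as above. The names total_cells and filled_cells are illustrative and not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['A', 'B', 'A', 'B', 'C'],
    'Store': ['X', 'X', 'Y', 'Y', 'X'],
    'Quantity': [10, 20, 30, 40, 50],
})

pivot = df.pivot_table(index='Product', columns='Store', values='Quantity', aggfunc='sum')

# Every cell in the Product x Store grid, including empty ones.
total_cells = pivot.size

# Cells that actually hold a value; notna() marks them True.
filled_cells = int(pivot.notna().sum().sum())

print(pivot)
print(total_cells, filled_cells)
```

Product C was only sold in store X, so the (C, Y) combination has no source rows and shows up as NaN.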

🔧 Debug

advanced

Identify the error in pivot_table usage

What error does this code raise?

import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Value': [1, 2, 3]}
df = pd.DataFrame(data)

pivot = df.pivot_table(index='Category', values='Value', aggfunc='mean', columns='NonExistent')

A. KeyError: 'NonExistent'

B. TypeError: aggfunc must be callable or string

C. ValueError: index and columns must be different

D. No error, outputs pivot table
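
To see what the call does in practice, it can be wrapped in a try/except. The sketch below is one way to do that. The pre-check at the end is a hypothetical guard, not something pandas requires, and the names raised, wanted, and missing are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Category': ['A', 'B', 'A'], 'Value': [1, 2, 3]})

# Capture whatever exception the call raises instead of letting it propagate.
raised = None
try:
    df.pivot_table(index='Category', columns='NonExistent',
                   values='Value', aggfunc='mean')
except KeyError as exc:
    raised = exc

print(type(raised).__name__)

# Hypothetical guard: check the requested names against df.columns first.
wanted = ['Category', 'NonExistent', 'Value']
missing = [name for name in wanted if name not in df.columns]
print(missing)
```

Checking the names up front lets the caller report every missing column at once rather than stopping at the first lookup failure.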

🚀 Application

advanced

Using pivot_table to find average ratings

Given this data, which pivot_table call produces the average rating per user per product?

import pandas as pd

data = {'User': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice'],
        'Product': ['X', 'X', 'Y', 'Y', 'X'],
        'Rating': [...]}  # the five rating values are missing from the source
df = pd.DataFrame(data)

pivot = df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='mean')

A. df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='count')

B. df.pivot_table(index='Product', columns='User', values='Rating', aggfunc='sum')

C. df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='max')

D. df.pivot_table(index='User', columns='Product', values='Rating', aggfunc='mean')
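
The options differ in orientation and in aggfunc, which is easiest to see on concrete numbers. The sketch below uses hypothetical Rating values, since the source does not give any. Only the User and Product columns come from the challenge.

```python
import pandas as pd

df = pd.DataFrame({
    'User': ['Alice', 'Bob', 'Alice', 'Bob', 'Alice'],
    'Product': ['X', 'X', 'Y', 'Y', 'X'],
    'Rating': [5, 3, 4, 2, 3],  # hypothetical values, not from the source
})

# One row per user, one column per product, each cell the average rating.
mean_pivot = df.pivot_table(index='User', columns='Product',
                            values='Rating', aggfunc='mean')

# Same layout, but each cell counts how many ratings went into it.
count_pivot = df.pivot_table(index='User', columns='Product',
                             values='Rating', aggfunc='count')

print(mean_pivot)
print(count_pivot)
```

With these assumed values, Alice rated product X twice (5 and 3), so the mean table shows 4.0 in that cell while the count table shows 2.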

🧠 Conceptual

expert

Understanding fill_value in pivot_table

What is the effect of using fill_value=0 in pivot_table?

A. It removes rows with any NaN values from the pivot table.

B. It replaces all NaN values in the pivot table with 0.

C. It fills missing values with the mean of the column.

D. It causes pivot_table to ignore missing values during aggregation.
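
A small experiment can help test each statement. The sketch below reuses the Product/Store data from the earlier challenge, where one combination has no rows, and builds the table with and without fill_value=0. The names plain and filled are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Product': ['A', 'B', 'A', 'B', 'C'],
    'Store': ['X', 'X', 'Y', 'Y', 'X'],
    'Quantity': [10, 20, 30, 40, 50],
})

# Without fill_value, the (C, Y) cell has no source rows and is NaN.
plain = df.pivot_table(index='Product', columns='Store',
                       values='Quantity', aggfunc='sum')

# With fill_value=0, the same table is built with that argument set.
filled = df.pivot_table(index='Product', columns='Store',
                        values='Quantity', aggfunc='sum', fill_value=0)

print(plain)
print(filled)
```

Comparing the two printed tables shows which cells change and whether any rows or columns disappear.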
