Challenge - 5 Problems

Long to Wide Format Master

❓ Predict Output

intermediate

Output of pivot operation on a simple DataFrame

Given the following pandas DataFrame in long format, what is the output after applying pivot to convert it to wide format?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65]
})

wide_df = df.pivot(index='Date', columns='City', values='Temperature')
print(wide_df)

A.
City        NY  LA
Date
2023-01-01  30  60
2023-01-02  28  65

B.
City        LA  NY
Date
2023-01-01  60  30
2023-01-02  65  28

C.
Date  2023-01-01  2023-01-02
City
LA            60          65
NY            30          28

D.
Date  2023-01-01  2023-01-02
City
NY            30          28
LA            60          65

❓ Data Output

intermediate

Number of columns after pivot with multiple value columns

Consider this long-format DataFrame with two measurement columns. If you call pivot with index='Date' and columns='City' and omit the values argument, how many columns will the resulting wide DataFrame have?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42]
})

# Attempt to pivot
# wide_df = df.pivot(index='Date', columns='City')  # What happens?

A. 4 columns

B. Raises a ValueError due to multiple values for one index/column pair

C. 2 columns

D. 1 column
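
Once you have committed to an answer, one way to check it is to run the commented-out call and inspect the result. The sketch below assumes a recent pandas release; it prints the number of column levels, the column labels, and the shape, so you can compare them with your prediction.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42],
})

# With `values` omitted, pivot uses every column not named in index/columns.
wide_df = df.pivot(index='Date', columns='City')

# Inspect the structure of the resulting columns.
print(wide_df.columns.nlevels)   # number of levels in the column index
print(list(wide_df.columns))     # the column labels themselves
print(wide_df.shape)             # (rows, columns)
```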

🚀 Application

advanced

Using pivot_table to handle duplicates in long to wide conversion

You have this long format DataFrame with duplicate entries for the same Date and City. Which code snippet correctly converts it to wide format by averaging duplicates?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Temperature': [30, 32, 60, 65]
})

A. wide_df = df.pivot_table(index='Date', columns='City', values='Temperature', aggfunc='mean')

B. wide_df = df.pivot_table(index='Date', columns='City', values='Temperature', aggfunc='sum')

C. wide_df = df.pivot(index='Date', columns='City', values='Temperature')

D. wide_df = df.pivot(index='Date', columns='City')
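
To see how the choice of aggfunc affects duplicated pairs, a minimal sketch like the following may help. It first counts rows per (Date, City) pair to locate the duplicates, then builds the wide table twice with different aggregation functions. The names dup_counts, mean_df, and sum_df are illustrative, not part of the challenge.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Temperature': [30, 32, 60, 65],
})

# Count rows per (Date, City) pair; any count above 1 marks a duplicate.
dup_counts = df.groupby(['Date', 'City']).size()
print(dup_counts[dup_counts > 1])

# pivot_table collapses each duplicated pair using the chosen function.
mean_df = df.pivot_table(index='Date', columns='City',
                         values='Temperature', aggfunc='mean')
sum_df = df.pivot_table(index='Date', columns='City',
                        values='Temperature', aggfunc='sum')

print(mean_df)
print(sum_df)
```

Pairs that never occur in the long data, such as NY on 2023-01-02 here, appear as NaN in both tables.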

🔧 Debug

advanced

Identify the error in this pivot code

What error will this code raise when trying to convert the long DataFrame to wide format?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60]
})

wide_df = df.pivot(index='City', columns='Date')
print(wide_df)

A. TypeError: pivot() missing required argument 'values'

B. ValueError: Index contains duplicate entries, cannot reshape

C. No error, prints wide DataFrame

D. KeyError: 'Temperature'
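
When debugging a call like this, it can help to record what actually happens rather than guess. The sketch below wraps the call in a try/except and stores either the exception type or a "no error" marker; the outcome variable is a hypothetical helper added for illustration, and the behavior assumes a recent pandas release.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60],
})

try:
    wide_df = df.pivot(index='City', columns='Date')
    outcome = 'no error'
except Exception as exc:
    # Keep the exception class name so it can be compared with the options.
    wide_df = None
    outcome = type(exc).__name__

print(outcome)
print(wide_df)
```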

🧠 Conceptual

expert

Understanding the difference between pivot and pivot_table

Which statement correctly explains the difference between pivot and pivot_table in pandas when converting long to wide format?

A. pivot can handle duplicate entries by aggregating them, while pivot_table cannot.

B. pivot automatically fills missing values with zeros, while pivot_table leaves them as NaN.

C. pivot and pivot_table are identical in functionality and usage.

D. pivot_table allows aggregation of duplicate entries using functions like mean or sum, while pivot raises an error if duplicates exist.
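
To test these statements yourself, you could run both functions on the same data containing a repeated index/column pair. The sketch below uses a small hypothetical DataFrame (not one from the challenges above) and captures whatever pivot does, then shows the pivot_table result for comparison.

```python
import pandas as pd

# Hypothetical data: the ('2023-01-01', 'NY') pair appears twice.
df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'NY'],
    'Temperature': [30, 32, 28],
})

# Try plain pivot and record any ValueError message it raises.
try:
    df.pivot(index='Date', columns='City', values='Temperature')
    pivot_error = None
except ValueError as exc:
    pivot_error = str(exc)
print(pivot_error)

# Run pivot_table on the same data with an explicit aggregation function.
table = df.pivot_table(index='Date', columns='City',
                       values='Temperature', aggfunc='mean')
print(table)
```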
