
Long to wide format conversion in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of pivot operation on a simple DataFrame
Given the following pandas DataFrame in long format, what does the code print after `pivot` converts it to wide format?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65]
})

wide_df = df.pivot(index='Date', columns='City', values='Temperature')
print(wide_df)
A
City        NY  LA
Date
2023-01-01  30  60
2023-01-02  28  65
B
City        LA  NY
Date
2023-01-01  60  30
2023-01-02  65  28
C
Date  2023-01-01  2023-01-02
City
LA            60          65
NY            30          28
D
Date  2023-01-01  2023-01-02
City
NY            30          28
LA            60          65
💡 Hint
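Remember that `pivot` turns the unique values of `index` into row labels and the unique values of `columns` into column labels, and that the new column labels come out in sorted order rather than in order of appearance.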
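The hint can be checked on a small example. The sketch below uses made-up sales data (the `Month`, `Product`, and `Units` names are illustrative assumptions, not part of the challenge) to show where the row and column labels of the wide frame come from.

```python
import pandas as pd

# Hypothetical long-format data, invented for illustration only.
long_df = pd.DataFrame({
    'Month': ['2024-01', '2024-01', '2024-02', '2024-02'],
    'Product': ['Widget', 'Gadget', 'Widget', 'Gadget'],
    'Units': [5, 7, 6, 9],
})

# One row per Month, one column per unique Product.
wide_df = long_df.pivot(index='Month', columns='Product', values='Units')

# Column labels are sorted, even though 'Widget' appears first in the data.
print(list(wide_df.columns))  # ['Gadget', 'Widget']

# Each cell holds the value for that (Month, Product) pair.
print(wide_df.loc['2024-01', 'Widget'])  # 5
```

Because every (Month, Product) pair is present exactly once, no cell is missing and the integer dtype of `Units` is kept.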
Data Output
intermediate
Number of columns after pivot with multiple value columns
Consider this DataFrame with multiple measurement columns in long format. After calling `pivot` with index='Date' and columns='City', and without passing `values`, how many columns will the resulting wide DataFrame have?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42]
})

# Pivot without passing values: how many columns does wide_df have?
wide_df = df.pivot(index='Date', columns='City')
A
4 columns
B
Raises a ValueError due to multiple values for one index/column pair
C
2 columns
D
1 column
💡 Hint
When no `values` is specified, `pivot` uses all remaining columns (`Temperature`, `Humidity`) and creates MultiIndex columns: 2 value columns × 2 cities = 4 columns.
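A minimal sketch of the hint, reusing the DataFrame from this problem, shows what the MultiIndex columns look like and how one measurement can be selected afterwards. The variable name `temps` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42],
})

# No values argument: every remaining column is pivoted.
wide_df = df.pivot(index='Date', columns='City')

# The columns have two levels: the measurement name and the city.
print(wide_df.columns.nlevels)  # 2
print(wide_df.shape)            # (2, 4)

# Selecting the top-level label returns an ordinary wide frame per city.
temps = wide_df['Temperature']
print(temps.loc['2023-01-01', 'NY'])  # 30
```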
🚀 Application
advanced
Using pivot_table to handle duplicates in long to wide conversion
You have this long format DataFrame with duplicate entries for the same Date and City. Which code snippet correctly converts it to wide format by averaging duplicates?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Temperature': [30, 32, 60, 65]
})
A
wide_df = df.pivot_table(index='Date', columns='City', values='Temperature', aggfunc='mean')
B
wide_df = df.pivot_table(index='Date', columns='City', values='Temperature', aggfunc='sum')
C
wide_df = df.pivot(index='Date', columns='City', values='Temperature')
D
wide_df = df.pivot(index='Date', columns='City')
💡 Hint
Use pivot_table with an aggregation function to handle duplicates.
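To see why an aggregation function is needed, the sketch below runs both functions on a small made-up dataset with one duplicated pair. The `Day`, `Sensor`, and `Reading` names and the `pivot_failed` flag are illustrative assumptions, not part of the challenge.

```python
import pandas as pd

# Hypothetical readings; ('Mon', 'a') appears twice.
long_df = pd.DataFrame({
    'Day': ['Mon', 'Mon', 'Mon', 'Tue'],
    'Sensor': ['a', 'a', 'b', 'b'],
    'Reading': [10, 20, 5, 7],
})

# pivot cannot decide which of the two ('Mon', 'a') values to keep.
try:
    long_df.pivot(index='Day', columns='Sensor', values='Reading')
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f'pivot failed: {exc}')

# pivot_table collapses duplicates with the given aggregation function.
wide_df = long_df.pivot_table(
    index='Day', columns='Sensor', values='Reading', aggfunc='mean'
)
print(wide_df.loc['Mon', 'a'])  # 15.0, the mean of 10 and 20
```

The ('Tue', 'a') cell has no data at all, so it stays NaN in the result.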
🔧 Debug
advanced
Identify the error in this pivot code
What happens when this code tries to convert the long DataFrame to wide format: does it raise an error, and if so, which one?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60]
})

wide_df = df.pivot(index='City', columns='Date')
print(wide_df)
A
TypeError: pivot() missing required argument 'values'
B
ValueError: Index contains duplicate entries, cannot reshape
C
No error, prints wide DataFrame
D
KeyError: 'Temperature'
💡 Hint
If `values` is not specified, `pivot` automatically uses all remaining columns as values. Also check whether any (`City`, `Date`) pair appears more than once.
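The behavior described in the hint can be sketched on made-up data with the same shape of problem: one remaining column, no `values` argument, and no repeated index/column pair. The `Store`, `Week`, and `Sales` names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical data: each (Store, Week) pair appears at most once.
long_df = pd.DataFrame({
    'Store': ['North', 'South'],
    'Week': ['W1', 'W2'],
    'Sales': [100, 250],
})

# values is omitted, so the only remaining column ('Sales') is used.
wide_df = long_df.pivot(index='Store', columns='Week')

# The columns are a MultiIndex of (value column, Week) pairs.
print(wide_df.columns.tolist())  # [('Sales', 'W1'), ('Sales', 'W2')]

# Combinations absent from the long data become NaN.
print(int(wide_df.isna().sum().sum()))  # 2
```

Because the missing cells are NaN, the `Sales` values are stored as floats in the wide frame.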
🧠 Conceptual
expert
Understanding the difference between pivot and pivot_table
Which statement correctly explains the difference between pivot and pivot_table in pandas when converting long to wide format?
A
`pivot` can handle duplicate entries by aggregating them, while `pivot_table` cannot.
B
`pivot` automatically fills missing values with zeros, while `pivot_table` leaves them as NaN.
C
`pivot` and `pivot_table` are identical in functionality and usage.
D
`pivot_table` allows aggregation of duplicate entries using functions like mean or sum, while `pivot` raises an error if duplicates exist.
💡 Hint
Think about how each function handles duplicates and missing data.
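For the missing-data half of the hint, the sketch below compares the two functions on made-up data where some combinations never occur. The column names and the `by_pivot` / `by_table` variables are illustrative assumptions; `fill_value` is an optional parameter of `pivot_table`.

```python
import pandas as pd

# Hypothetical data: ('Mon', 'b') and ('Tue', 'a') never occur.
long_df = pd.DataFrame({
    'Day': ['Mon', 'Tue'],
    'Sensor': ['a', 'b'],
    'Reading': [10, 7],
})

# pivot leaves combinations that are absent from the data as NaN.
by_pivot = long_df.pivot(index='Day', columns='Sensor', values='Reading')
print(int(by_pivot.isna().sum().sum()))  # 2

# pivot_table also yields NaN by default, but fill_value can replace them.
by_table = long_df.pivot_table(
    index='Day', columns='Sensor', values='Reading',
    aggfunc='sum', fill_value=0,
)
print(int(by_table.isna().sum().sum()))  # 0
print(by_table.loc['Mon', 'b'])          # 0
```

With `pivot`, the same effect needs an explicit follow-up step such as `by_pivot.fillna(0)`.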