Challenge - 5 Problems
Long to Wide Format Master
❓ Predict Output
intermediate
Output of pivot operation on a simple DataFrame
Given the following pandas DataFrame in long format, what is the output after applying `pivot` to convert it to wide format?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65]
})

wide_df = df.pivot(index='Date', columns='City', values='Temperature')
print(wide_df)
💡 Hint
Remember that `pivot` uses the `index` argument for the rows and the `columns` argument for the columns of the new DataFrame.
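The mapping described in the hint can be checked by inspecting the labels of the pivoted frame. The sketch below reuses the DataFrame from the question and only looks at the row labels, column labels, and cell values:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
})

wide_df = df.pivot(index='Date', columns='City', values='Temperature')

# Row labels come from 'Date', column labels from 'City'.
print(wide_df.index.tolist())    # ['2023-01-01', '2023-01-02']
print(wide_df.columns.tolist())  # ['LA', 'NY']

# Each cell holds the Temperature for that Date/City pair.
print(wide_df.values.tolist())   # [[60, 30], [65, 28]]
```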
The `pivot` method rearranges the DataFrame so that 'Date' becomes the index (rows), 'City' becomes the columns, and 'Temperature' fills the values. The column labels come out in sorted order, so 'LA' comes before 'NY'.
❓ Data Output
intermediate
Number of columns after pivot with multiple value columns
Consider this DataFrame with multiple measurement columns in long format. After pivoting with `pivot` using index='Date' and columns='City', how many columns will the resulting wide DataFrame have?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42]
})

# Attempt to pivot
# wide_df = df.pivot(index='Date', columns='City')
# What happens?
💡 Hint
When no `values` is specified, `pivot` uses all remaining columns (`Temperature`, `Humidity`), creating MultiIndex columns: 2 value columns × 2 cities = 4 columns.
`pivot` without `values` pivots all columns except `index` and `columns`, producing a DataFrame with MultiIndex columns ('Humidity'/'Temperature' × 'LA'/'NY') totaling 4 columns.
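To see the MultiIndex columns concretely, the commented-out call from the question can be run and the shape of the result inspected. This is a small sketch using the same data; it checks the column count and shows how one measurement can be selected from the top level:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02', '2023-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65],
    'Humidity': [55, 40, 60, 42],
})

# No `values` argument: every remaining column is pivoted.
wide_df = df.pivot(index='Date', columns='City')

print(wide_df.shape)            # (2, 4): 2 dates, 2 measurements x 2 cities
print(wide_df.columns.nlevels)  # 2: measurement name, then city

# Selecting a top-level label returns the per-city frame for that measurement.
humidity = wide_df['Humidity']
print(humidity.columns.tolist())  # ['LA', 'NY']
```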
🚀 Application
advanced
Using pivot_table to handle duplicates in long to wide conversion
You have this long format DataFrame with duplicate entries for the same Date and City. Which code snippet correctly converts it to wide format by averaging duplicates?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Temperature': [30, 32, 60, 65]
})
💡 Hint
Use `pivot_table` with an aggregation function to handle duplicates.
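One way to follow the hint is sketched below. It is a snippet that would satisfy the question as stated, not necessarily the wording of the quiz's own answer option. It uses `pivot_table` with `aggfunc='mean'` on the DataFrame from the question:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Temperature': [30, 32, 60, 65],
})

# Duplicate (Date, City) pairs are combined with the mean.
wide_df = df.pivot_table(
    index='Date',
    columns='City',
    values='Temperature',
    aggfunc='mean',
)
print(wide_df)

# NY on 2023-01-01 is the mean of 30 and 32.
print(wide_df.loc['2023-01-01', 'NY'])  # 31.0
```

NY has no reading on 2023-01-02, so that cell is NaN in the result.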
Since there are duplicate Date-City pairs, `pivot` raises a ValueError. `pivot_table` accepts an aggregation function such as mean or sum to combine the duplicates.
🔧 Debug
advanced
Identify the error in this pivot code
What error will this code raise when trying to convert the long DataFrame to wide format?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60]
})

wide_df = df.pivot(index='City', columns='Date')
print(wide_df)
💡 Hint
`pivot` automatically uses remaining columns as `values` if not specified.
Because only one column ('Temperature') remains after index='City' and columns='Date', `pivot` uses it implicitly. The code raises no error: it prints a wide DataFrame in which the City-Date combinations that have no row in the input are NaN.
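Running the snippet confirms that no exception is raised. The sketch below uses the same data and counts the NaN cells that mark the City-Date combinations absent from the input:

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60],
})

# `values` is omitted, so the only remaining column is used.
wide_df = df.pivot(index='City', columns='Date')

print(wide_df.shape)           # (2, 2)
print(wide_df.index.tolist())  # ['LA', 'NY']

# Two of the four cells have no matching input row.
missing = int(wide_df.isna().sum().sum())
print(missing)                 # 2
```

Because `values` was not passed, the columns form a MultiIndex, so a single cell is addressed with a tuple such as `('Temperature', '2023-01-01')`.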
🧠 Conceptual
expert
Understanding the difference between pivot and pivot_table
Which statement correctly explains the difference between `pivot` and `pivot_table` in pandas when converting long to wide format?
💡 Hint
Think about how each function handles duplicates and missing data.
`pivot` only reshapes the data and raises a ValueError if any index/columns pair appears more than once. `pivot_table` is more flexible: it applies an aggregation function (mean by default) to combine duplicates.
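The contrast can be sketched side by side. The DataFrame below is a small hypothetical example (not taken from the quiz) with one duplicated Date-City pair. `aggfunc='max'` and `fill_value=0` are illustrative choices, the latter showing one way `pivot_table` can handle missing combinations:

```python
import pandas as pd

# Hypothetical data: ('2023-01-01', 'NY') appears twice.
df = pd.DataFrame({
    'Date': ['2023-01-01', '2023-01-01', '2023-01-02'],
    'City': ['NY', 'NY', 'LA'],
    'Temperature': [30, 32, 65],
})

# pivot cannot place two values in one cell, so it raises.
try:
    df.pivot(index='Date', columns='City', values='Temperature')
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f"pivot raised ValueError: {exc}")

# pivot_table aggregates the duplicates and can fill empty cells.
table = df.pivot_table(
    index='Date',
    columns='City',
    values='Temperature',
    aggfunc='max',
    fill_value=0,
)
print(table)
```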