
Why reshaping data matters in Pandas - Challenge Your Understanding

Challenge - 5 Problems
Predict Output
intermediate
What is the output shape after pivoting?

Given the DataFrame df below, what will be the shape of the DataFrame after applying pivot?

Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65]
})

pivoted = df.pivot(index='Date', columns='City', values='Temperature')
print(pivoted.shape)
A. (4, 1)
B. (2, 4)
C. (1, 4)
D. (2, 2)
💡 Hint

Think about how many unique dates and cities there are.
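The hint can be tried on any long-format table: when every (index, columns) pair occurs exactly once, pivot produces one row per unique index value and one column per unique columns value. A minimal sketch, assuming a hypothetical `readings` table (the names and values are illustrative, not the quiz data):

```python
import pandas as pd

# Hypothetical long-format data: 3 unique days x 2 unique sensors.
readings = pd.DataFrame({
    'Day': ['Mon', 'Mon', 'Tue', 'Tue', 'Wed', 'Wed'],
    'Sensor': ['s1', 's2', 's1', 's2', 's1', 's2'],
    'Value': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

wide = readings.pivot(index='Day', columns='Sensor', values='Value')

# Shape predicted from the unique counts alone.
expected_shape = (readings['Day'].nunique(), readings['Sensor'].nunique())
print(wide.shape == expected_shape)
```

The same counting argument applies to the Date and City columns in the question.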

Data Output
intermediate
Result of melting a DataFrame

What is the resulting DataFrame after melting the following data?

Pandas
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'Math': [90, 80],
    'Science': [85, 95]
})
melted = pd.melt(df, id_vars=['ID'], var_name='Subject', value_name='Score')
print(melted)
A. [{'ID':1,'Subject':'Math','Score':90},{'ID':2,'Subject':'Science','Score':95}]
B. [{'ID':1,'Subject':'Math','Score':90},{'ID':1,'Subject':'Science','Score':85},{'ID':2,'Subject':'Math','Score':80},{'ID':2,'Subject':'Science','Score':95}]
C. [{'Subject':'Math','Score':[90,80]},{'Subject':'Science','Score':[85,95]}]
D. [{'ID':1,'Math':90,'Science':85},{'ID':2,'Math':80,'Science':95}]
💡 Hint

Melt turns columns into rows keeping the id_vars fixed.
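To see the hint in action, the sketch below melts a hypothetical `wide` table (illustrative names and values, not the quiz data) and checks that each original row contributes one output row per melted column:

```python
import pandas as pd

# Hypothetical wide-format data: 3 rows, 1 id column, 2 value columns.
wide = pd.DataFrame({
    'Name': ['Ann', 'Bob', 'Cy'],
    'Height': [160, 175, 182],
    'Weight': [55, 70, 80],
})

long = pd.melt(wide, id_vars=['Name'], var_name='Measure', value_name='Reading')

# Every column except 'Name' is melted into rows.
n_value_cols = wide.shape[1] - 1
print(len(long) == len(wide) * n_value_cols)
print(list(long.columns))
```

Note that melt stacks the result column by column: all 'Height' rows come first, followed by all 'Weight' rows.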

Visualization
advanced
Visualizing reshaped data for comparison

You have sales data for two products across three months in wide format. Which plot best shows the monthly sales comparison after reshaping the data to long format?

Pandas
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar'],
    'Product_A': [100, 120, 130],
    'Product_B': [90, 110, 115]
})

long_df = pd.melt(df, id_vars=['Month'], var_name='Product', value_name='Sales')

plt.figure(figsize=(6,4))
for product in long_df['Product'].unique():
    subset = long_df[long_df['Product'] == product]
    plt.plot(subset['Month'], subset['Sales'], marker='o', label=product)
plt.legend()
plt.title('Monthly Sales Comparison')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.tight_layout()
plt.show()
A. Line plot with months on x-axis and sales on y-axis, separate lines for each product
B. Pie chart showing total sales per product
C. Bar chart with months on x-axis and sales stacked by product
D. Scatter plot with sales on x-axis and months on y-axis
💡 Hint

Think about how to compare trends over time for multiple products.

🔧 Debug
advanced
Identify the error in reshaping code

What error, if any, will this code raise?

Pandas
import pandas as pd

df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60]
})

pivoted = df.pivot(index='Date', columns='Temperature', values='City')
print(pivoted)
A. No error, prints pivoted DataFrame
B. KeyError: 'Temperature'
C. ValueError: Index contains duplicate entries, cannot reshape
D. TypeError: unhashable type: 'list'
💡 Hint

Check if the index and columns have unique values.
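The uniqueness check from the hint can be run before pivoting. The sketch below uses a hypothetical `dup` table (not the quiz data) in which one (Date, City) pair is repeated, shows that `pivot` raises a `ValueError` in that case, and shows `pivot_table` as one possible way to aggregate the repeats instead:

```python
import pandas as pd

# Hypothetical data with a repeated (Date, City) pair.
dup = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02'],
    'City': ['NY', 'NY', 'LA'],
    'Temperature': [30, 32, 60],
})

# Check whether any (index, columns) pair occurs more than once.
has_duplicates = dup.duplicated(subset=['Date', 'City']).any()
print(has_duplicates)

# pivot cannot place two values in one cell, so it raises ValueError.
try:
    dup.pivot(index='Date', columns='City', values='Temperature')
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# pivot_table aggregates the repeated pair instead of raising.
table = dup.pivot_table(index='Date', columns='City',
                        values='Temperature', aggfunc='mean')
print(table)
```

Applying the same `duplicated` check to the index and columns chosen in the question indicates whether a reshape error is possible there.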

🚀 Application
expert
Choosing reshaping method for time series analysis

You have a DataFrame with daily sales data for multiple stores in wide format, where each store is a column. You want to prepare the data for time series analysis that requires a single sales column and a store identifier column. Which reshaping method should you use?

A. Use groupby to aggregate sales by store
B. Use pivot to convert the data to long format
C. Use melt to convert the data to long format
D. Use concat to join all store columns vertically
💡 Hint

Think about turning columns into rows while keeping identifiers.
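As an illustration of turning columns into rows while keeping an identifier, the sketch below reshapes a hypothetical `sales_wide` table (store names, dates, and values are assumptions made for this example) into one sales column plus a store identifier column:

```python
import pandas as pd

# Hypothetical wide-format daily sales: one column per store.
sales_wide = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-02']),
    'Store_1': [200, 210],
    'Store_2': [150, 160],
})

# 'Date' stays fixed; the store columns become (Store, Sales) rows.
sales_long = sales_wide.melt(id_vars='Date', var_name='Store', value_name='Sales')

# Sort so each store's series is contiguous and in date order.
sales_long = sales_long.sort_values(['Store', 'Date']).reset_index(drop=True)
print(sales_long)
```

Sorting by store and then date is a convenience for per-store time series work, not a requirement of the reshape itself.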