Challenge - 5 Problems

🎖️

Reshaping Master

Get all challenges correct to earn this badge!

Test your skills under time pressure!

❓ Predict Output

intermediate

2:00 remaining

What is the output shape after pivoting?

Given the DataFrame df below, what will be the shape of the DataFrame after applying pivot?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
    'City': ['NY', 'LA', 'NY', 'LA'],
    'Temperature': [30, 60, 28, 65]
})

pivoted = df.pivot(index='Date', columns='City', values='Temperature')
print(pivoted.shape)

A. (4, 1)

B. (2, 4)

C. (1, 4)

D. (2, 2)
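
As a related sketch on different, hypothetical data (the `readings` frame and its column names are assumptions, not part of the challenge), the shape of a pivoted frame can be reasoned about by counting the unique values in the index and columns arguments, provided each index/column pair occurs only once:

```python
import pandas as pd

# Hypothetical data, not the quiz DataFrame: 3 days x 2 sensors, one row per pair.
readings = pd.DataFrame({
    'Day': ['Mon', 'Mon', 'Tue', 'Tue', 'Wed', 'Wed'],
    'Sensor': ['s1', 's2', 's1', 's2', 's1', 's2'],
    'Value': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

wide = readings.pivot(index='Day', columns='Sensor', values='Value')

# Rows come from the unique index values, columns from the unique column values.
expected_shape = (readings['Day'].nunique(), readings['Sensor'].nunique())
print(wide.shape, expected_shape)
```

The same counting approach should carry over to other frames, as long as no index/column pair is repeated.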

❓ Data Output

intermediate

2:00 remaining

Result of melting a DataFrame

What is the resulting DataFrame, written as a list of row records, after melting the following data?

Pandas

import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2],
    'Math': [90, 80],
    'Science': [85, 95]
})
melted = pd.melt(df, id_vars=['ID'], var_name='Subject', value_name='Score')
print(melted)

A. [{'ID':1,'Subject':'Math','Score':90},{'ID':2,'Subject':'Science','Score':95}]

B. [{'ID':1,'Subject':'Math','Score':90},{'ID':1,'Subject':'Science','Score':85},{'ID':2,'Subject':'Math','Score':80},{'ID':2,'Subject':'Science','Score':95}]

C. [{'Subject':'Math','Score':[90,80]},{'Subject':'Science','Score':[85,95]}]

D. [{'ID':1,'Math':90,'Science':85},{'ID':2,'Math':80,'Science':95}]
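
The options are written as lists of dictionaries rather than as printed tables. A minimal sketch, using a hypothetical `frame` that is unrelated to the quiz data, of how a DataFrame can be converted to that notation for comparison:

```python
import pandas as pd

# Hypothetical two-row frame, unrelated to the quiz data.
frame = pd.DataFrame({'Name': ['a', 'b'], 'Count': [1, 2]})

# orient='records' yields one dict per row, keyed by column name.
records = frame.to_dict(orient='records')
print(records)
```

Note that the order of the records follows the row order of the DataFrame, so two frames with the same rows in a different order would produce different lists.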

❓ Visualization

advanced

3:00 remaining

Visualizing reshaped data for comparison

You have sales data for two products across three months in wide format. Which plot best shows the monthly sales comparison after reshaping the data to long format?

Pandas

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'Month': ['Jan', 'Feb', 'Mar'],
    'Product_A': [100, 120, 130],
    'Product_B': [90, 110, 115]
})

long_df = pd.melt(df, id_vars=['Month'], var_name='Product', value_name='Sales')

plt.figure(figsize=(6,4))
for product in long_df['Product'].unique():
    subset = long_df[long_df['Product'] == product]
    plt.plot(subset['Month'], subset['Sales'], marker='o', label=product)
plt.legend()
plt.title('Monthly Sales Comparison')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.tight_layout()
plt.show()

A. Line plot with months on x-axis and sales on y-axis, separate lines for each product

B. Pie chart showing total sales per product

C. Bar chart with months on x-axis and sales stacked by product

D. Scatter plot with sales on x-axis and months on y-axis

🔧 Debug

advanced

2:00 remaining

Identify the error in reshaping code

What error, if any, will this code raise?

Pandas

import pandas as pd

df = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-02'],
    'City': ['NY', 'LA'],
    'Temperature': [30, 60]
})

pivoted = df.pivot(index='Date', columns='Temperature', values='City')
print(pivoted)

A. No error, prints pivoted DataFrame

B. KeyError: 'Temperature'

C. ValueError: Index contains duplicate entries, cannot reshape

D. TypeError: unhashable type: 'list'
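
For background on the duplicate-entries message, here is a hedged sketch on a separate, hypothetical `dupes` frame (not the quiz data) in which one index/column pair appears twice. It also shows `pivot_table`, which aggregates repeated pairs instead of raising:

```python
import pandas as pd

# Hypothetical data: the ('2024-01-01', 'NY') pair appears twice.
dupes = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02'],
    'City': ['NY', 'NY', 'LA'],
    'Temperature': [30, 32, 60],
})

try:
    dupes.pivot(index='Date', columns='City', values='Temperature')
    error_message = None
except ValueError as exc:
    # pivot cannot place two values in the same cell.
    error_message = str(exc)

print(error_message)

# pivot_table aggregates repeated pairs (mean by default) instead of raising.
averaged = dupes.pivot_table(index='Date', columns='City', values='Temperature')
print(averaged)
```

Whether a given call to pivot raises therefore depends on whether any index/column pair is repeated in the data.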

🚀 Application

expert

3:00 remaining

Choosing reshaping method for time series analysis

You have a DataFrame with daily sales data for multiple stores in wide format, where each store is a column. You want to prepare the data for time series analysis that requires a single sales column and a store identifier column. Which reshaping method should you use?
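
To make the scenario concrete, here is a minimal sketch of what such a wide DataFrame might look like. The `Date` column, the store names, and the sales figures are all hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical store names and sales figures, for illustration only.
wide_sales = pd.DataFrame({
    'Date': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03']),
    'Store_1': [250, 270, 260],
    'Store_2': [310, 295, 320],
    'Store_3': [180, 200, 190],
})

print(wide_sales)

# Columns the target layout would need: one date column, one store
# identifier column, and a single sales column.
target_columns = ['Date', 'Store', 'Sales']

# One row per (date, store) pair: number of dates times number of store columns.
target_row_count = len(wide_sales) * (wide_sales.shape[1] - 1)
print(target_columns, target_row_count)
```

The target layout has one row for every date and store combination, so it grows in length as stores are added rather than in width.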

A. Use groupby to aggregate sales by store

B. Use pivot to convert the data to long format

C. Use melt to convert the data to long format

D. Use concat to join all store columns vertically