Challenge - 5 Problems

Data Reshaping Mastery

🧠 Conceptual

intermediate

Why reshape data before analysis?

Why do data scientists often reshape data before analyzing it?

A. To organize data into a format that makes analysis easier and more meaningful

B. To convert numerical data into text format

C. To encrypt data for security reasons

D. To reduce the size of the dataset by removing rows

❓ Data Output

intermediate

Output of pivot operation on sales data

Given this sales data, what does the code print after pivoting with 'Month' as the index, 'Product' as the columns, and 'Sales' as the values?

import pandas as pd
data = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Product': ['A', 'B', 'A', 'B'],
    'Sales': [100, 150, 200, 250]
})
pivoted = data.pivot(index='Month', columns='Product', values='Sales')
print(pivoted)

A.
Month  Product  Sales
Jan    A        100
Jan    B        150
Feb    A        200
Feb    B        250

B.
  Month Product  Sales
0   Jan       A    100
1   Jan       B    150
2   Feb       A    200
3   Feb       B    250

C.
Product    A    B
Month            
Feb      200  250
Jan      100  150

D.
Product  Month  Sales
A        Jan    100
B        Jan    150
A        Feb    200
B        Feb    250
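
One detail that is easy to miss when reading pivot output: pivot() returns the index labels in sorted order, not in the order they first appear in the data. The sketch below reuses the question's data to show this and one possible way to restore calendar order. The name calendar_order is only illustrative.

```python
import pandas as pd

data = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Product': ['A', 'B', 'A', 'B'],
    'Sales': [100, 150, 200, 250],
})

# One row per Month, one column per Product, cells filled from Sales.
pivoted = data.pivot(index='Month', columns='Product', values='Sales')

# The index comes back sorted alphabetically, so 'Feb' precedes 'Jan'.
print(list(pivoted.index))

# Reindex with the months in the order they first appear in the data
# to get calendar order back (illustrative name, not part of the quiz).
calendar_order = pivoted.reindex(data['Month'].unique())
print(calendar_order)
```

Reindexing by first appearance only works when the source rows are already in calendar order; an ordered categorical would be a sturdier choice for real data.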

🔧 Debug

advanced

Identify error in melting data

What error does this code produce when trying to reshape data using melt?

import pandas as pd
data = pd.DataFrame({
    'ID': [1, 2],
    'Math': [90, 80],
    'Science': [85, 95]
})
melted = pd.melt(data, id_vars=['ID'], value_vars=['Math', 'History'])
print(melted)

A. TypeError: id_vars must be a list

B. KeyError: 'History'

C. ValueError: value_vars cannot be empty

D. No error, prints melted DataFrame
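
To see how melt reacts to a column name that is not in the DataFrame, the following sketch wraps the call in try/except and then shows one possible guard. The names requested, outcome, and present are illustrative, and the exact wording of the error message varies between pandas versions, so it is printed instead of assumed.

```python
import pandas as pd

data = pd.DataFrame({
    'ID': [1, 2],
    'Math': [90, 80],
    'Science': [85, 95],
})

# 'History' is not a column of data.
requested = ['Math', 'History']

try:
    pd.melt(data, id_vars=['ID'], value_vars=requested)
    outcome = 'no error'
except KeyError as exc:
    outcome = 'KeyError'
    print('melt failed:', exc)

# One possible guard (an assumption, not the only fix): keep only the
# requested columns that actually exist before melting.
present = [col for col in requested if col in data.columns]
melted = pd.melt(data, id_vars=['ID'], value_vars=present)
print(melted)
```

Silently dropping missing columns can hide typos, so in real code it may be better to report the missing names than to filter them out.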

❓ Visualization

advanced

Visualizing reshaped data with a heatmap

Which option shows the correct heatmap code to visualize sales data after pivoting by 'Month' and 'Product'?

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
data = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Product': ['A', 'B', 'A', 'B'],
    'Sales': [100, 150, 200, 250]
})
pivoted = data.pivot(index='Month', columns='Product', values='Sales')

A.
sns.heatmap(pivoted, annot=True)
plt.show()

B.
plt.plot(pivoted)
plt.show()

C.
sns.barplot(data=pivoted)
plt.show()

D.
sns.heatmap(data, annot=True)
plt.show()
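
A heatmap colors a two-dimensional grid of numbers, so it helps to check what each DataFrame contains before plotting it. The sketch below does that check with pandas alone, without drawing anything. The helper all_numeric is hypothetical and not part of pandas or seaborn.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

data = pd.DataFrame({
    'Month': ['Jan', 'Jan', 'Feb', 'Feb'],
    'Product': ['A', 'B', 'A', 'B'],
    'Sales': [100, 150, 200, 250],
})
pivoted = data.pivot(index='Month', columns='Product', values='Sales')


def all_numeric(frame):
    """Return True when every column of frame has a numeric dtype."""
    return all(is_numeric_dtype(dtype) for dtype in frame.dtypes)


# The long-format frame mixes text columns with the Sales numbers.
print(all_numeric(data))

# The pivoted frame is a Month x Product grid of Sales values only.
print(all_numeric(pivoted))
```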

🚀 Application

expert

Choosing the right reshape method for time series analysis

You have a dataset with daily temperature readings for multiple cities in wide format (each city is a column). You want to analyze temperature trends over time for each city. Which reshaping method is best to prepare the data?

A. Use groupby to aggregate temperatures by city without reshaping

B. Use pivot to convert the data into a wider format with cities as rows

C. Use concat to join city columns into a single column without changing the format

D. Use melt to convert the wide format into long format with columns for date, city, and temperature
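
As a rough illustration of the wide-to-long conversion described in the scenario, here is a minimal sketch. The city names, dates, and readings are made up for the example, and the column names date, city, and temperature are assumptions taken from the wording of the options.

```python
import pandas as pd

# Illustrative values only: one row per day, one column per city.
wide = pd.DataFrame({
    'date': pd.to_datetime(['2024-01-01', '2024-01-02']),
    'Oslo': [-3.0, -1.5],
    'Cairo': [18.0, 19.5],
})

# Every column except 'date' is unpivoted: its name goes to 'city'
# and its values go to 'temperature'.
long = wide.melt(id_vars='date', var_name='city', value_name='temperature')
print(long)

# In long format, a per-city summary is a single groupby.
mean_by_city = long.groupby('city')['temperature'].mean()
print(mean_by_city)
```

Long format keeps one observation per row, which is the layout that groupby and most plotting libraries expect when comparing groups over time.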
