
Wide to long format conversion in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of pandas melt function
What is the output DataFrame after running this code?
Pandas
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'A_2020': [10, 20],
    'A_2021': [15, 25],
    'B_2020': [5, 10],
    'B_2021': [7, 14]
})

result = df.melt(id_vars=['id'], var_name='variable', value_name='value')
print(result)
A
   id variable  value
0   1   A_2020     10
1   2   A_2020     20
2   1   A_2021     15
3   2   A_2021     25
4   1   B_2020      5
5   2   B_2020     10
6   1   B_2021      7
7   2   B_2021     14
B
   id variable  value
0   1        A     10
1   2        A     20
2   1        A     15
3   2        A     25
4   1        B      5
5   2        B     10
6   1        B      7
7   2        B     14
C
   id  A_2020  A_2021  B_2020  B_2021
0   1      10      15       5       7
1   2      20      25      10      14
D
   id variable  value
0   1   2020_A     10
1   2   2020_A     20
2   1   2021_A     15
3   2   2021_A     25
4   1   2020_B      5
5   2   2020_B     10
6   1   2021_B      7
7   2   2021_B     14
💡 Hint
melt keeps the id_vars columns as they are and stacks every other column into two columns: one holding the original column name (variable) and one holding the corresponding value.
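The stacking behaviour described in the hint can be sketched on a small made-up frame. The column names `height` and `weight` are illustrative assumptions and do not come from the challenge.

```python
import pandas as pd

# Hypothetical wide table: one row per id, one column per measurement.
wide = pd.DataFrame({
    'id': [1, 2],
    'height': [170, 180],
    'weight': [65, 80],
})

# The id_vars column is repeated as-is; every other column is stacked,
# one source column after another, into a name column and a value column.
long_df = wide.melt(id_vars=['id'], var_name='variable', value_name='value')
print(long_df)
```

The source columns are processed in their original left-to-right order, so all `height` rows come before all `weight` rows.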
Data Output
intermediate
Number of rows after wide to long conversion
Given a DataFrame with 3 rows, one 'id' column, and 4 value columns, how many rows does the result have after calling melt with id_vars=['id']?
A
7
B
12
C
15
D
3
💡 Hint
Each original row expands into multiple rows, one per value column melted.
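The row arithmetic in the hint can be checked on a frame of a different size. This is a minimal sketch with made-up data (2 rows and 3 value columns named `q1` to `q3`), not the frame from the question.

```python
import pandas as pd

# Hypothetical frame: 2 rows, one id column, 3 value columns.
wide = pd.DataFrame({
    'id': [1, 2],
    'q1': [10, 20],
    'q2': [11, 21],
    'q3': [12, 22],
})

value_cols = [c for c in wide.columns if c != 'id']
long_df = wide.melt(id_vars=['id'])

# Each original row contributes one output row per melted column.
expected_rows = len(wide) * len(value_cols)
print(len(long_df), expected_rows)
```

The same multiplication applies to any frame, provided every non-id column is melted.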
🔧 Debug
advanced
Identify the error in this wide to long conversion code
What error, if any, does this code raise when run?
Pandas
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'score_1': [10, 20],
    'score_2': [15, 25]
})

result = pd.melt(df, id_vars='id', var_name='test', value_name='score')
print(result)
A
TypeError: id_vars must be list-like
B
ValueError: var_name must be a list
C
No error, prints melted DataFrame
D
KeyError: 'id' not found in columns
💡 Hint
Check which types the id_vars argument of melt accepts.
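One way to explore the hint is to call melt with the identifier given in two different forms and compare the results. The sketch below uses a made-up frame (the names `key`, `x`, and `y` are assumptions). It relies on the pandas documentation listing scalar, tuple, list, and ndarray as accepted forms for `id_vars`.

```python
import pandas as pd

# Hypothetical frame; column names are illustrative only.
wide = pd.DataFrame({
    'key': ['a', 'b'],
    'x': [1, 2],
    'y': [3, 4],
})

# Pass the identifier column as a bare string, then as a one-element list.
from_scalar = pd.melt(wide, id_vars='key')
from_list = pd.melt(wide, id_vars=['key'])

# Report whether the two calls produced identical frames.
print(from_scalar.equals(from_list))
```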
🚀 Application
advanced
Extract year and variable from melted column
After melting a DataFrame with columns like 'A_2020' and 'B_2021', which option is the idiomatic pandas way to split the 'variable' column into two new columns, 'var' and 'year'?
Pandas
import pandas as pd

df = pd.DataFrame({
    'id': [1, 2],
    'A_2020': [10, 20],
    'B_2021': [5, 15]
})

melted = df.melt(id_vars=['id'], var_name='variable', value_name='value')

# Fill in the code to split 'variable' into 'var' and 'year'
A
melted['var'], melted['year'] = zip(*melted['variable'].str.split('_'))
B
melted['var'], melted['year'] = melted['variable'].split('_')
C
melted[['var', 'year']] = melted['variable'].str.split('_', expand=True)
D
melted['var'] = melted['variable'].apply(lambda x: x.split('_')[0])
melted['year'] = melted['variable'].apply(lambda x: x.split('_')[1])
💡 Hint
Use pandas string split with expand=True to create multiple columns.
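The hint can be illustrated on an already-long frame with made-up labels. In this sketch the names `metric`, `sales_2019`, and `cost_2020` are assumptions chosen for illustration. The final integer conversion is an optional extra step, not something the challenge asks for.

```python
import pandas as pd

# Hypothetical long-format frame with compound labels in 'variable'.
long_df = pd.DataFrame({
    'id': [1, 2],
    'variable': ['sales_2019', 'cost_2020'],
    'value': [100, 40],
})

# expand=True returns a DataFrame with one column per split piece,
# which can be assigned to two new columns in a single statement.
parts = long_df['variable'].str.split('_', expand=True)
long_df[['metric', 'year']] = parts

# The split pieces are strings; convert the year if numbers are needed.
long_df['year'] = long_df['year'].astype(int)
print(long_df)
```

Without `expand=True`, `str.split` returns a single Series of lists, which cannot be assigned directly to two columns this way.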
🧠 Conceptual
expert
Why convert from wide to long format in data analysis?
Which of these is the best reason to convert data from wide to long format?
A
Wide format is required for machine learning models.
B
Wide format is always better for plotting with seaborn and matplotlib.
C
Long format reduces the number of rows in the dataset.
D
Long format allows easier use of groupby and aggregation functions in pandas.
💡 Hint
Think about how data shape affects grouping and summarizing.
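To see how data shape affects summarizing, here is a minimal sketch that melts a small frame and then averages the values per original column. The data is made up for illustration.

```python
import pandas as pd

# Hypothetical wide frame: one column per (variable, year) combination.
wide = pd.DataFrame({
    'id': [1, 2],
    'A_2020': [10, 20],
    'A_2021': [15, 25],
})

long_df = wide.melt(id_vars=['id'], var_name='variable', value_name='value')

# In long format, the former column names are ordinary values, so a
# single groupby call summarizes all of them at once.
means = long_df.groupby('variable')['value'].mean()
print(means)
```

In the wide layout, the same summary means addressing each column by name, and the set of columns changes whenever a new year is added.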