
Wide to long format conversion in Pandas - Step-by-Step Execution

Concept Flow - Wide to long format conversion
Start with wide DataFrame
Select columns to keep as id_vars
Select columns to melt into long format
Apply pd.melt() function
Create long format DataFrame
Use or visualize long DataFrame
Melting converts a wide table into a longer one: the selected columns are stacked into rows, leaving one column for the former column names and one for their values.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [90, 80],
    'English': [85, 88]
})

long_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')
This code converts a wide DataFrame with subjects as columns into a long DataFrame with one subject column and one score column.
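The sample relies on the default behavior of pd.melt(): when value_vars is not passed, every column not listed in id_vars is melted. As a minimal sketch, the call below spells out value_vars explicitly and checks that both forms give the same result. The variable names implicit and explicit are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [90, 80],
    'English': [85, 88],
})

# Implicit form from the sample: all columns except id_vars are melted.
implicit = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

# Explicit form: list the columns to melt with value_vars.
explicit = pd.melt(
    df,
    id_vars=['Name'],
    value_vars=['Math', 'English'],
    var_name='Subject',
    value_name='Score',
)

print(explicit)
print(implicit.equals(explicit))  # True
```

Passing value_vars explicitly is worth considering when the wide table has extra columns that should be left out of the long result.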
Execution Table
| Step | Action | DataFrame State | Result |
|---|---|---|---|
| 1 | Create wide DataFrame | {'Name': ['Alice', 'Bob'], 'Math': [90, 80], 'English': [85, 88]} | DataFrame with 3 columns and 2 rows |
| 2 | Select id_vars=['Name'] | Columns to keep: ['Name'] | Name column stays as is |
| 3 | Determine value_vars (not passed, so all remaining columns are used) | Columns to melt: ['Math', 'English'] | These columns will be converted to rows |
| 4 | Apply pd.melt() | Melt columns into long format | DataFrame with columns: Name, Subject, Score |
| 5 | Resulting long DataFrame | Rows: 4 (2 names x 2 subjects) | (Alice, Math, 90); (Bob, Math, 80); (Alice, English, 85); (Bob, English, 88) |
| 6 | End | No further changes | Conversion complete |
💡 All specified columns melted; long format DataFrame created
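The row count in step 5 follows from simple arithmetic: each original row appears once per melted column. The sketch below checks that rule against the sample data. The names melted_cols and expected_rows are hypothetical helpers, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [90, 80],
    'English': [85, 88],
})

id_cols = ['Name']
# Columns that melt() will stack, since value_vars is not passed.
melted_cols = [c for c in df.columns if c not in id_cols]

long_df = pd.melt(df, id_vars=id_cols, var_name='Subject', value_name='Score')

# rows in long format = rows in wide format x number of melted columns
expected_rows = len(df) * len(melted_cols)

print(expected_rows)                 # 4
print(long_df.shape)                 # (4, 3)
print(long_df['Subject'].tolist())   # ['Math', 'Math', 'English', 'English']
```

The Subject values come out grouped by original column, which matches the row order shown in step 5.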
Variable Tracker
| Variable | Start | After Step 1 | After Step 4 | Final |
|---|---|---|---|---|
| df | undefined | {'Name': ['Alice', 'Bob'], 'Math': [90, 80], 'English': [85, 88]} | Same wide DataFrame | Same wide DataFrame |
| long_df | undefined | undefined | Long format DataFrame with columns Name, Subject, Score | Same long format DataFrame |
Key Moments - 2 Insights
Why do we need to specify id_vars in pd.melt()?
id_vars are the columns kept as identifiers and not melted. Without them, every column is melted, including 'Name', so each value loses its link to the row it came from. See Execution Table steps 2 and 3.
What happens if we don't specify var_name and value_name?
pd.melt() falls back to the default column names 'variable' and 'value' for the melted columns. Passing var_name and value_name replaces them with meaningful names. See Execution Table step 4.
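Both insights can be seen by calling pd.melt() with fewer arguments. The sketch below is illustrative only: melted_all and default_names are names chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [90, 80],
    'English': [85, 88],
})

# No id_vars: every column is melted, including 'Name'.
melted_all = pd.melt(df)
print(list(melted_all.columns))         # ['variable', 'value']
print(melted_all['variable'].tolist())
# ['Name', 'Name', 'Math', 'Math', 'English', 'English']

# id_vars given, but no var_name or value_name: default names are used.
default_names = pd.melt(df, id_vars=['Name'])
print(list(default_names.columns))      # ['Name', 'variable', 'value']
```

In melted_all, the scores no longer sit next to the person they belong to, which is the loss of row identity described above.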
Visual Quiz - 3 Questions
Test your understanding
Look at step 5 of the Execution Table. How many rows does the long DataFrame have?
A. 2
B. 3
C. 4
D. 6
💡 Hint
Check step 5 in the Execution Table: 2 names times 2 subjects equals 4 rows.
At which step do we specify which columns to keep as identifiers?
A. Step 1
B. Step 2
C. Step 4
D. Step 5
💡 Hint
See Execution Table step 2, where id_vars=['Name'] is selected.
If we omit id_vars in pd.melt(), what happens to the 'Name' column?
A. It becomes part of the melted variable column
B. It is dropped from the DataFrame
C. It stays as a column
D. It becomes the index
💡 Hint
Without id_vars, all columns are melted, so 'Name' shows up as an entry in the variable column and the names themselves move into the value column (see Key Moments, question 1).
Concept Snapshot
pd.melt() converts wide to long format.
Use id_vars to keep columns as identifiers.
Use var_name and value_name to name melted columns.
The resulting DataFrame has more rows and usually fewer columns (in the sample it stays at 3, because only two columns are melted).
Useful for tidy data and plotting.
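One reason the long format is convenient for tidy-data work is that a single groupby on the melted column covers every subject at once. The sketch below assumes the same sample data; the per-subject average is an illustrative choice and mean_by_subject is a hypothetical name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Math': [90, 80],
    'English': [85, 88],
})

long_df = pd.melt(df, id_vars=['Name'], var_name='Subject', value_name='Score')

# One grouped aggregation handles all subjects, however many there are.
mean_by_subject = long_df.groupby('Subject')['Score'].mean()
print(mean_by_subject)
```

With the wide table, the same summary would need each subject column to be addressed by name.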
Full Transcript
We start with a wide DataFrame that has one row per person and multiple columns for subjects. We want to convert it to a long format where each row is one person-subject pair with a score. We keep the 'Name' column as an identifier using id_vars. We melt the subject columns into two columns: one for subject names and one for scores. The pd.melt() function does this conversion. The result is a longer DataFrame with columns Name, Subject, and Score. This format is easier for analysis and plotting. Key points are to specify id_vars to keep identifiers and optionally name the new columns with var_name and value_name.