PandasHow-ToBeginner · 2 min read

Pandas How to Convert Wide to Long Format Easily

Use pandas.melt() to convert a wide DataFrame to long format by specifying the id_vars (columns to keep) and value_vars (columns to unpivot), like pd.melt(df, id_vars=['id'], value_vars=['A', 'B']).

📋

Examples

Inputdf = pd.DataFrame({'id': [1, 2], 'A': [10, 20], 'B': [30, 40]})

Output id variable value 0 1 A 10 1 2 A 20 2 1 B 30 3 2 B 40

Inputdf = pd.DataFrame({'Name': ['Tom', 'Jerry'], 'Math': [90, 80], 'Science': [85, 95]})

Output Name Subject Score 0 Tom Math 90 1 Jerry Math 80 2 Tom Science 85 3 Jerry Science 95

Inputdf = pd.DataFrame({'id': [1], 'X': [5], 'Y': [10]})

Output id variable value 0 1 X 5 1 1 Y 10

🧠

How to Think About It

To convert wide to long format, think about keeping some columns fixed (id_vars) and turning other columns (value_vars) into two columns: one for the variable name and one for the value. This reshapes the data from many columns into fewer columns but more rows.

📐

Algorithm

Identify columns to keep as identifiers (id_vars).

Identify columns to unpivot into long format (value_vars).

Use pandas melt function with these parameters.

Return the reshaped DataFrame.

💻

Code

pandas

import pandas as pd

df = pd.DataFrame({'id': [1, 2], 'A': [10, 20], 'B': [30, 40]})
long_df = pd.melt(df, id_vars=['id'], value_vars=['A', 'B'], var_name='variable', value_name='value')
print(long_df)

Output

id variable value 0 1 A 10 1 2 A 20 2 1 B 30 3 2 B 40

🔍

Dry Run

Let's trace the example DataFrame through the melt function.

Original DataFrame

id: [1, 2], A: [10, 20], B: [30, 40]

Identify id_vars and value_vars

id_vars = ['id'], value_vars = ['A', 'B']

Melt operation

Rows created for each id and each variable in A and B with their values

id	variable	value
1	A	10
2	A	20
1	B	30
2	B	40

💡

Why This Works

Step 1: Keep id_vars fixed

The columns in id_vars stay the same for each row in the long format.

Step 2: Unpivot value_vars

Columns in value_vars are turned into two columns: one for the variable name and one for its value.

Step 3: Create longer DataFrame

This reshaping increases rows but reduces columns, making data easier to analyze in some cases.

🔄

Alternative Approaches

stack() method

pandas

import pandas as pd

df = pd.DataFrame({'id': [1, 2], 'A': [10, 20], 'B': [30, 40]})
long_df = df.set_index('id').stack().reset_index(name='value').rename(columns={'level_1': 'variable'})
print(long_df)

Stack works well but requires setting index first; melt is more straightforward for this task.

wide_to_long() function

pandas

import pandas as pd

df = pd.DataFrame({'id': [1, 2], 'A_2019': [10, 20], 'A_2020': [15, 25]})
long_df = pd.wide_to_long(df, stubnames='A', i='id', j='year', sep='_').reset_index()
print(long_df)

wide_to_long is useful for columns with common prefixes and suffixes but needs specific column naming.

⚡

Complexity: O(nm) time, O(nm) space

Time Complexity

The melt operation processes each cell in the value_vars columns, so time grows with number of rows (n) times number of columns to unpivot (m).

Space Complexity

The output DataFrame has more rows, roughly n*m, so space grows accordingly.

Which Approach is Fastest?

pd.melt is optimized and usually faster and simpler than stack or wide_to_long for general wide-to-long reshaping.

Approach	Time	Space	Best For
pd.melt	O(n*m)	O(n*m)	General wide to long conversion
stack	O(n*m)	O(n*m)	When index setting is convenient
wide_to_long	O(n*m)	O(n*m)	Columns with common prefixes/suffixes

💡

Use pd.melt() with id_vars to keep columns fixed and value_vars to unpivot others.

⚠️

Forgetting to specify id_vars causes all columns to be unpivoted, losing important identifiers.

Examples

How to Think About It

Algorithm

Code

Dry Run

Original DataFrame

Identify id_vars and value_vars

Melt operation

Why This Works

Step 1: Keep id_vars fixed

Step 2: Unpivot value_vars

Step 3: Create longer DataFrame

Alternative Approaches

Complexity: O(n*m) time, O(n*m) space

Time Complexity

Space Complexity

Which Approach is Fastest?

Complexity: O(nm) time, O(nm) space