How to Reshape Data in Python: Simple Guide with Examples
To reshape data in Python, use the pandas library with methods like pivot(), melt(), and stack(). These functions let you change the layout of your data frames by converting between wide and long formats or rearranging rows and columns.

Syntax
Here are common pandas methods to reshape data:
- pivot(index, columns, values): Converts long data to wide format.
- melt(id_vars, value_vars): Converts wide data to long format.
- stack(): Moves column labels into the row index, creating a multi-level index.
- unstack(): Moves a row index level into the columns, the opposite of stack().
```python
import pandas as pd

# pivot syntax
# df.pivot(index='row_id', columns='column_id', values='value_column')

# melt syntax
# pd.melt(df, id_vars=['id_columns'], value_vars=['value_columns'])

# stack syntax
# df.stack()

# unstack syntax
# df.unstack()
```
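The examples in this guide focus on pivot() and melt(), so here is a minimal sketch of stack() and unstack() for comparison. The sample frame and the variable names (wide_df, stacked, restored, by_city) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data: one row per date, one column per city.
wide_df = pd.DataFrame(
    {"Los Angeles": [60, 65], "New York": [30, 28]},
    index=pd.Index(["2024-01-01", "2024-01-02"], name="Date"),
)
wide_df.columns.name = "City"

# stack() moves the column labels into the innermost index level,
# returning a Series indexed by (Date, City).
stacked = wide_df.stack()
print(stacked)

# unstack() with no arguments moves the innermost level (City) back into columns.
restored = stacked.unstack()
print(restored)

# Passing level="Date" moves the Date level instead, so cities become the rows.
by_city = stacked.unstack(level="Date")
print(by_city)
```

Some pandas 2.x releases may emit a FutureWarning for stack() about a newer implementation; for data like this, without missing values, the result should be the same either way.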
Example
This example shows how to reshape a simple data frame from long to wide format using pivot() and back to long format using melt().
```python
import pandas as pd

data = {
    'Date': ['2024-01-01', '2024-01-01', '2024-01-02', '2024-01-02'],
    'City': ['New York', 'Los Angeles', 'New York', 'Los Angeles'],
    'Temperature': [30, 60, 28, 65]
}
df = pd.DataFrame(data)

# Reshape from long to wide format
wide_df = df.pivot(index='Date', columns='City', values='Temperature')
print('Wide format:')
print(wide_df)

# Reshape back to long format
long_df = wide_df.reset_index().melt(
    id_vars='Date',
    value_vars=['New York', 'Los Angeles'],
    var_name='City',
    value_name='Temperature'
)
print('\nLong format:')
print(long_df)
```
Output
```
Wide format:
City        Los Angeles  New York
Date
2024-01-01           60        30
2024-01-02           65        28

Long format:
         Date         City  Temperature
0  2024-01-01     New York           30
1  2024-01-02     New York           28
2  2024-01-01  Los Angeles           60
3  2024-01-02  Los Angeles           65
```
Common Pitfalls
Common mistakes when reshaping data include:
- Using pivot() when the data has duplicate entries for the same index and column combination, which raises a ValueError.
- Not resetting the index before using melt(). melt() ignores the index by default, so labels stored there (such as the dates in a pivoted frame) are dropped from the result.
- Confusing stack() and unstack(), which leads to the wrong data shape.
Always check your data for duplicates and understand the shape you want before reshaping.
```python
import pandas as pd

data = {
    'Date': ['2024-01-01', '2024-01-01', '2024-01-01'],
    'City': ['New York', 'New York', 'Los Angeles'],
    'Temperature': [30, 32, 60]
}
df = pd.DataFrame(data)

# This will raise an error because of duplicate Date and City
try:
    df.pivot(index='Date', columns='City', values='Temperature')
except ValueError as e:
    print('Error:', e)

# Correct approach: use pivot_table with aggregation
pivot_table = df.pivot_table(index='Date', columns='City', values='Temperature', aggfunc='mean')
print('\nPivot table with aggregation:')
print(pivot_table)
```
Output
```
Error: Index contains duplicate entries, cannot reshape

Pivot table with aggregation:
City        Los Angeles  New York
Date
2024-01-01         60.0      31.0
```
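The melt() pitfall from the list above can be illustrated in the same way. This is a minimal sketch on a hypothetical wide frame whose Date labels live in the index; the names lost_dates and long_df are made up for the example.

```python
import pandas as pd

# Hypothetical wide frame: Date is the index, not a regular column.
wide_df = pd.DataFrame(
    {"Los Angeles": [60, 65], "New York": [30, 28]},
    index=pd.Index(["2024-01-01", "2024-01-02"], name="Date"),
)

# melt() ignores the index by default, so the Date labels are lost.
lost_dates = wide_df.melt(var_name="City", value_name="Temperature")
print(lost_dates.columns.tolist())

# Resetting the index first turns Date into a regular column usable as id_vars.
long_df = wide_df.reset_index().melt(
    id_vars="Date", var_name="City", value_name="Temperature"
)
print(long_df.columns.tolist())
print(long_df)
```

Assuming pandas 1.1 or later, melt(ignore_index=False) is another option: it keeps the original index on the result instead of turning it into a column.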
Quick Reference
| Method | Purpose | Key Parameters |
|---|---|---|
| pivot() | Convert long to wide format | index, columns, values |
| melt() | Convert wide to long format | id_vars, value_vars |
| stack() | Move columns to rows | None |
| unstack() | Move rows to columns | level (optional) |
| pivot_table() | Pivot with aggregation for duplicates | index, columns, values, aggfunc |
Key Takeaways
- Use pandas methods like pivot(), melt(), stack(), and unstack() to reshape data frames.
- pivot() requires unique index-column pairs; use pivot_table() with aggfunc for duplicates.
- melt() converts wide data to long format and often needs id_vars to keep fixed columns.
- Check your data shape before and after reshaping to avoid confusion.
- Call reset_index() before melt() when the identifier you need is stored in the index rather than in a column.
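The duplicate and shape checks mentioned above could be sketched as follows. The data is hypothetical, and averaging the duplicates is only one possible choice, not a recommendation for every dataset.

```python
import pandas as pd

# Hypothetical data with a repeated (Date, City) pair.
df = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-01"],
    "City": ["New York", "New York", "Los Angeles"],
    "Temperature": [30, 32, 60],
})

# keep=False marks every row that shares its (Date, City) pair with another row.
dupes = df[df.duplicated(subset=["Date", "City"], keep=False)]
print(dupes)

if dupes.empty:
    # No duplicates: a plain pivot is safe.
    wide_df = df.pivot(index="Date", columns="City", values="Temperature")
else:
    # Duplicates present: aggregate them (here with the mean, as an assumption).
    wide_df = df.pivot_table(
        index="Date", columns="City", values="Temperature", aggfunc="mean"
    )

# Compare shapes before and after to confirm the layout is the intended one.
print(df.shape, "->", wide_df.shape)
```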