How to Use Stack in pandas: Syntax, Example, and Tips
In pandas, stack() pivots the columns of a DataFrame into rows. For a DataFrame with single-level columns, the result is a Series with a multi-level index. It reshapes data from wide to long format by moving the innermost level of the columns into the row index.
Syntax
The basic syntax of stack() is:
DataFrame.stack(level=-1, dropna=True)
level: Specifies which column level to stack if columns have multiple levels (default is the innermost level).
dropna: If True, it drops missing values in the stacked result.
python
stacked = df.stack(level=-1, dropna=True)
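The level parameter only matters when the columns have more than one level. The sketch below uses a small made-up DataFrame (the labels x, y, a, b, r1, r2 are illustrative, not from any real dataset) to show the difference between stacking the innermost level and the outermost level. In both cases one column level remains, so the result is a DataFrame rather than a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level columns: outer level x/y, inner level a/b
columns = pd.MultiIndex.from_product([['x', 'y'], ['a', 'b']])
df = pd.DataFrame(np.arange(8).reshape(2, 4), index=['r1', 'r2'], columns=columns)

# Default (level=-1): the inner level a/b moves into the row index,
# and the outer level x/y stays as the columns
inner = df.stack()

# level=0: the outer level x/y moves into the row index,
# and the inner level a/b stays as the columns
outer = df.stack(level=0)

print(inner)
print(outer)
```

Depending on the installed pandas version, these calls may emit a FutureWarning about the stack() implementation; the reshaped values are the same either way.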
Example
This example shows how stack() converts columns into rows, reshaping the DataFrame from wide to long format.
python
import pandas as pd

data = {'A': [1, 2], 'B': [3, 4], 'C': [5, 6]}
df = pd.DataFrame(data, index=['row1', 'row2'])

# Move the column labels A, B, C into the inner level of the row index
stacked = df.stack()
print(stacked)
Output
row1  A    1
      B    3
      C    5
row2  A    2
      B    4
      C    6
dtype: int64
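The stacked Series is often easier to work with once its index levels become ordinary columns. The following is a minimal sketch of that step, plus a round trip with unstack() to show that the reshape is reversible. The names row, column, and value are illustrative choices, not something pandas assigns.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4], 'C': [5, 6]},
                  index=['row1', 'row2'])
stacked = df.stack()

# Name the two index levels, then turn them into regular columns.
# 'row', 'column', and 'value' are hypothetical names chosen here.
long_df = stacked.rename_axis(['row', 'column']).reset_index(name='value')
print(long_df)

# unstack() moves the inner index level back into the columns
restored = stacked.unstack()
print(restored)
```

The long form has one row per (row label, column label) pair, which is the layout that groupby and most plotting libraries expect.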
Common Pitfalls
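One common mistake is assuming that stack() always returns a Series. It does so only when the columns have a single level; with multi-level columns, the levels that are not stacked remain as columns and the result is a DataFrame. In both cases the stacked column labels become the innermost level of the row index, so the result has a MultiIndex even if the original index was a plain one.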
Another pitfall is not handling missing values properly; by default, stack() drops NaN values, which might lead to unexpected data loss.
python
import pandas as pd

# Wrong: stacking a DataFrame with missing values without knowing dropna behavior
data = {'A': [1, None], 'B': [3, 4]}
df = pd.DataFrame(data)
stacked = df.stack()
print(stacked)

# Right: explicitly control dropna
stacked_keepna = df.stack(dropna=False)
print(stacked_keepna)
Output
0  A    1.0
   B    3.0
1  B    4.0
dtype: float64
0  A    1.0
   B    3.0
1  A    NaN
   B    4.0
dtype: float64
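How stack() treats missing values is not identical across pandas releases (recent versions add a future_stack option that changes it), so it can be safer not to rely on the default. The sketch below counts the missing cells before stacking and then drops them explicitly with Series.dropna(), which should give the same result whichever default is in effect.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None], 'B': [3, 4]})

# Count missing cells first, so any data loss is a known quantity
n_missing = int(df.isna().sum().sum())

# Drop NaN explicitly after stacking instead of relying on the default
stacked = df.stack().dropna()

print(f"{n_missing} missing value(s) removed")
print(stacked)
```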
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| level | Which column level to stack (for MultiIndex columns) | -1 (innermost) |
| dropna | Whether to drop missing values in the result | True |
Key Takeaways
Use stack() to pivot columns into rows, reshaping data from wide to long format.
stack() returns a Series with a multi-level index that combines the original index and the stacked column labels (or a DataFrame if the columns have more than one level).
By default, stack() drops missing values; use dropna=False to keep them.
stack() handles MultiIndex columns; use level to choose which column level to stack.
Remember that stack() changes the shape and type of your data, so plan accordingly.