
How to Use Stack in pandas: Syntax, Example, and Tips

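In pandas, stack() pivots the columns of a DataFrame into the row index, reshaping data from wide to long format. By default it moves the innermost column level into the row index. For a DataFrame with single-level columns, the result is a Series whose multi-level index combines the original row index with the column labels; for a DataFrame with multi-level columns, the result is a DataFrame unless every column level is stacked.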
📐

Syntax

The basic syntax of stack() is:

  • DataFrame.stack(level=-1, dropna=True)

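level: Which column level (or levels) to stack when the columns are a MultiIndex. It accepts a level number, a level name, or a list of them; the default, -1, is the innermost level.

dropna: If True (the default), rows of the result that contain only missing values are dropped. In pandas 2.1 and later, the implementation enabled with future_stack=True keeps missing values and does not accept dropna.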

python
stacked = df.stack(level=-1, dropna=True)
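
The effect of level is easiest to see on a DataFrame with two column levels. The following is a minimal sketch; the region, metric, and year labels are made-up illustration data, not part of the original example.

```python
import pandas as pd

# Hypothetical data: two metrics measured in two years, stored as two-level columns.
columns = pd.MultiIndex.from_product(
    [["sales", "cost"], [2023, 2024]], names=["metric", "year"]
)
df = pd.DataFrame(
    [[10, 12, 4, 5], [20, 24, 8, 9]],
    index=["north", "south"],
    columns=columns,
)

# Default level=-1: the innermost column level ("year") moves into the row index.
by_year = df.stack()

# A level can also be chosen by name: here the outer level ("metric") moves instead.
by_metric = df.stack(level="metric")

print(by_year)
print(by_metric)
```

Because one column level remains after each call, both results are DataFrames rather than Series. Depending on the installed pandas version, calling stack() this way may emit a FutureWarning about the newer implementation.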
💻

Example

This example shows how stack() converts columns into rows, reshaping the DataFrame from wide to long format.

python
import pandas as pd

data = {'A': [1, 2], 'B': [3, 4], 'C': [5, 6]}
df = pd.DataFrame(data, index=['row1', 'row2'])

stacked = df.stack()
print(stacked)
Output
row1  A    1
      B    3
      C    5
row2  A    2
      B    4
      C    6
dtype: int64
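
If a flat long-format table is more convenient than a Series with a MultiIndex, the stacked result can be turned into ordinary columns. This is one possible sketch that reuses the same data; the names row, column, and value are arbitrary choices made for illustration.

```python
import pandas as pd

data = {'A': [1, 2], 'B': [3, 4], 'C': [5, 6]}
df = pd.DataFrame(data, index=['row1', 'row2'])

stacked = df.stack()

# Name the two index levels and the values, then move the index into columns.
long_df = (
    stacked
    .rename_axis(['row', 'column'])  # labels for the two index levels
    .rename('value')                 # name of the Series, used as the column name
    .reset_index()
)
print(long_df)
```

The result has one line per (row, column) pair, which is the shape most plotting and grouping tools expect.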
⚠️

Common Pitfalls

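stack() works on any DataFrame, but the result is only as readable as the index it starts from. With the default integer index, the outer level of the result is just 0, 1, 2, and so on, and any identifier column left among the regular columns is stacked together with the values. Setting a meaningful index first (for example with set_index()) avoids this. Also, with single-level columns stack() returns a Series, not a DataFrame, which can confuse beginners.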
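
The index pitfall can be illustrated with a short sketch. The name, math, and art columns are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical table: one identifier column and two score columns.
df = pd.DataFrame({'name': ['alice', 'bob'], 'math': [90, 80], 'art': [70, 60]})

# Stacking directly mixes the identifier strings with the numeric scores,
# so the result falls back to the generic object dtype.
mixed = df.stack()
print(mixed)

# Moving the identifier into the index first keeps the values numeric
# and labels each stacked entry with the name it belongs to.
tidy = df.set_index('name').stack()
print(tidy)
```

In the second result, each value is addressed by a (name, subject) pair, for example tidy.loc[('alice', 'math')].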

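Another pitfall is not handling missing values properly. With the default dropna=True, the original implementation of stack() drops NaN values, which can lead to unexpected data loss. This behavior depends on the pandas version: the implementation selected with future_stack=True (added in pandas 2.1) keeps NaN values and does not accept the dropna argument.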

python
import pandas as pd

# Wrong: stacking a DataFrame with missing values without knowing dropna behavior

data = {'A': [1, None], 'B': [3, 4]}
df = pd.DataFrame(data)
stacked = df.stack()
print(stacked)

# Right: explicitly control dropna (original implementation only; not accepted with future_stack=True)
stacked_keepna = df.stack(dropna=False)
print(stacked_keepna)
Output
0  A    1.0
   B    3.0
1  B    4.0
dtype: float64
0  A    1.0
   B    3.0
1  A    NaN
   B    4.0
dtype: float64
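
Since the default treatment of NaN can differ between pandas versions, one defensive option is to avoid relying on it and make the choice explicit after stacking. This is a sketch of that idea, not the only way to handle it.

```python
import pandas as pd

data = {'A': [1, None], 'B': [3, 4]}
df = pd.DataFrame(data)

# Count the missing cells before reshaping, so any later loss is easy to spot.
missing_cells = int(df.isna().sum().sum())

stacked = df.stack()

# Dropping NaN explicitly yields the same entries whether or not
# stack() already removed them.
clean = stacked.dropna()

print(f"missing cells in the wide table: {missing_cells}")
print(clean)
```

Comparing len(clean) with df.size minus missing_cells is a quick check that no values other than NaN were lost in the reshape.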
📊

Quick Reference

| Parameter | Description | Default |
|-----------|-------------|---------|
| level | Which column level to stack (for MultiIndex columns) | -1 (innermost) |
| dropna | Whether to drop missing values in the result | True |

Key Takeaways

Use stack() to pivot columns into rows, reshaping data from wide to long format.
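stack() returns a Series (or a DataFrame, when MultiIndex columns are only partly stacked) whose multi-level index combines the original index with the stacked column labels.
By default, stack() drops missing values; use dropna=False to keep them (with future_stack=True, available from pandas 2.1, they are always kept).
Set a meaningful index before stacking, and use level to choose which level of MultiIndex columns to stack.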
Remember that stack() changes the shape and type of your data, so plan accordingly.