How to Use Expanding in pandas for Cumulative Calculations

In pandas, expanding() creates a cumulative window that grows with each row, allowing you to calculate cumulative statistics like sum, mean, or max. You use it by calling expanding() on a DataFrame or Series, then applying an aggregation function such as sum() or mean().

Syntax

The basic syntax for using expanding() in pandas is:

  • DataFrame.expanding(min_periods=1) or Series.expanding(min_periods=1)
  • min_periods sets the minimum number of observations required to have a value (default is 1)
  • After calling expanding(), apply an aggregation function like sum(), mean(), max(), etc.
python
df.expanding(min_periods=1).sum()
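
To illustrate the min_periods parameter described above, here is a minimal sketch on a small DataFrame. The data and column names are hypothetical and chosen only for illustration; with min_periods=2, the first row should have no result because only one observation is available at that point.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Require at least 2 observations before a value is returned.
# The calculation runs independently on each column.
result = df.expanding(min_periods=2).sum()
print(result)
```

The first row comes back as NaN in both columns, and every later row holds the running total of that column up to and including the current row.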

Example

This example shows how to calculate the cumulative sum and cumulative mean of a pandas Series using expanding(). The window grows with each row, so the first row uses only itself, the second row uses the first two rows, and so on.

python
import pandas as pd

# Create a simple Series
s = pd.Series([2, 4, 6, 8, 10])

# Calculate cumulative sum
cumulative_sum = s.expanding().sum()

# Calculate cumulative mean
cumulative_mean = s.expanding().mean()

print('Original Series:')
print(s)
print('\nCumulative Sum:')
print(cumulative_sum)
print('\nCumulative Mean:')
print(cumulative_mean)
Output
Original Series:
0     2
1     4
2     6
3     8
4    10
dtype: int64

Cumulative Sum:
0     2.0
1     6.0
2    12.0
3    20.0
4    30.0
dtype: float64

Cumulative Mean:
0    2.0
1    3.0
2    4.0
3    5.0
4    6.0
dtype: float64
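
As a side note, pandas also has a cumsum() shortcut for the cumulative sum. The sketch below is not part of the original example. It compares the two on the same Series and then computes a running mean, a statistic that has no cum* shortcut, which is where expanding() is most useful.

```python
import pandas as pd

s = pd.Series([2, 4, 6, 8, 10])

expanding_sum = s.expanding().sum()  # returns float64 values
shortcut_sum = s.cumsum()            # keeps the original integer dtype

# The values match for data without missing entries; only the dtype differs
same_values = bool((expanding_sum == shortcut_sum).all())
print(same_values)

# A running mean has no cum* equivalent, so expanding() is the natural tool
running_mean = s.expanding().mean()
print(running_mean.tolist())
```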

Common Pitfalls

One common mistake is confusing expanding() with rolling(). expanding() windows grow with each row, while rolling() windows have a fixed size.

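Another pitfall is misunderstanding min_periods. With the default of 1, a row gets a result as soon as the window contains at least one non-NaN observation. Setting min_periods higher returns NaN for the early rows until the window holds that many non-NaN observations.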

python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Wrong: Using rolling when you want cumulative expanding
print('Rolling sum (window=3):')
print(s.rolling(window=3).sum())

# Right: Using expanding for cumulative sum
print('\nExpanding sum:')
print(s.expanding().sum())
Output
Rolling sum (window=3):
0     NaN
1     NaN
2     6.0
3     9.0
4    12.0
dtype: float64

Expanding sum:
0     1.0
1     3.0
2     6.0
3    10.0
4    15.0
dtype: float64
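
The min_periods pitfall can be sketched the same way. The Series below is made-up data containing one missing value. It shows that NaN entries do not count as observations, so a higher min_periods delays the first result further than the row position alone would suggest.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a missing value in the second position
s = pd.Series([1.0, np.nan, 3.0, 4.0])

# Default min_periods=1: a result appears once one non-NaN value has been seen
default_sum = s.expanding().sum()

# min_periods=3: a result appears only after three non-NaN values have been seen
strict_sum = s.expanding(min_periods=3).sum()

print(default_sum.tolist())
print(strict_sum.tolist())
```

With the default, the missing value is skipped and the running total carries forward. With min_periods=3, only the last row has enough non-NaN observations to return a value.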

Quick Reference

Here is a quick summary of expanding() usage:

Parameter/Method    Description
min_periods         Minimum number of observations required to return a value
sum()               Cumulative sum of values in the expanding window
mean()              Cumulative mean of values in the expanding window
max()               Cumulative maximum value in the expanding window
min()               Cumulative minimum value in the expanding window
count()             Number of non-NA observations in the expanding window
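
The remaining methods in the table follow the same pattern. As a rough sketch on made-up data, the running maximum, minimum, and count can be collected side by side in one DataFrame:

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series([3, 1, 4, 1, 5])

# Each column is computed over the same growing window
summary = pd.DataFrame({
    "max": s.expanding().max(),
    "min": s.expanding().min(),
    "count": s.expanding().count(),
})
print(summary)
```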

Key Takeaways

  • Use expanding() to create a cumulative window that grows with each row in pandas.
  • Apply aggregation functions like sum() or mean() after expanding() to get cumulative statistics.
  • Remember that expanding() differs from rolling(): its window grows with each row, while a rolling window has a fixed size.
  • Use min_periods to control when results start appearing; rows with fewer non-NaN observations than min_periods return NaN.
  • expanding() works on both pandas Series and DataFrames.