How to Calculate Rolling Sum in pandas: Simple Guide
To calculate a rolling sum in pandas, call the rolling(window) method on a DataFrame or Series, followed by sum(). This computes the sum of values over a sliding window of the specified size across your data.
Syntax
The basic syntax to calculate a rolling sum is:
data.rolling(window).sum()
Here, data is your pandas Series or DataFrame column.
window is the size of the moving window (number of rows) to sum over.
The method returns a new Series or DataFrame with the rolling sums.
python
rolling_sum = data.rolling(window=3).sum()
Example
This example shows how to calculate a rolling sum with a window size of 3 on a pandas Series.
python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6])
rolling_sum = data.rolling(window=3).sum()
print(rolling_sum)
Output
0     NaN
1     NaN
2     6.0
3     9.0
4    12.0
5    15.0
dtype: float64
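To see where each value in the output comes from, the rolling sum can be recomputed by hand. The following is a minimal sketch rather than how pandas implements it internally; the names window and manual are introduced here purely for illustration.

```python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5, 6])
window = 3  # illustrative variable holding the window size

rolling_sum = data.rolling(window=window).sum()

# Recompute each value by hand: sum the current row and the two rows
# before it. Rows without a full window get NaN, matching the default.
manual = [
    float(data.iloc[i - window + 1 : i + 1].sum()) if i >= window - 1 else float("nan")
    for i in range(len(data))
]

print(manual)  # [nan, nan, 6.0, 9.0, 12.0, 15.0]
```

For instance, the value at index 3 is 2 + 3 + 4 = 9, which is the sum of the rows at index 1, 2, and 3.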
Common Pitfalls
Common mistakes when calculating rolling sums include:
- Not setting the window size correctly, which changes every result.
- Expecting results for the first few rows, where the window is not yet full; these return NaN by default.
- Calling rolling on a whole DataFrame without selecting a column, which computes a rolling sum for every column rather than only the one you wanted.
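To avoid NaN at the start, set min_periods=1. A sum is then returned as soon as the window contains at least one value, so the first rows are summed over fewer values than the full window.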
python
import pandas as pd

data = pd.Series([1, 2, 3, 4, 5])

# Wrong: no min_periods, first two are NaN
print(data.rolling(window=3).sum())

# Right: min_periods=1 to avoid NaN
print(data.rolling(window=3, min_periods=1).sum())
Output
0     NaN
1     NaN
2     6.0
3     9.0
4    12.0
dtype: float64
0     1.0
1     3.0
2     6.0
3     9.0
4    12.0
dtype: float64
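The DataFrame pitfall listed above can be sketched as follows. The DataFrame and its column names (sales, returns, sales_rolling_2) are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical example data; the column names are made up for illustration.
df = pd.DataFrame({"sales": [10, 20, 30, 40], "returns": [1, 0, 2, 1]})

# Rolling over the whole DataFrame sums every column independently.
all_columns = df.rolling(window=2).sum()

# Selecting one column first returns a Series for just that column.
sales_only = df["sales"].rolling(window=2).sum()

# Store the result alongside the original data.
df["sales_rolling_2"] = sales_only
print(df)
```

Selecting the column before calling rolling keeps the result focused on the values you care about and makes it easy to assign back as a new column.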
Quick Reference
Summary tips for rolling sums in pandas:
- Use rolling(window).sum() to get rolling sums.
- window defines how many rows to include.
- Use min_periods=1 to avoid NaN at the start.
- Works on Series and DataFrame columns.
Key Takeaways
Use pandas' rolling(window).sum() to calculate rolling sums easily.
The window size controls how many values are summed at each step.
By default, rolling sums return NaN until the window is full; use min_periods=1 to change this.
Rolling sums work on both Series and DataFrame columns.
Always check your window size and min_periods to get expected results.