
How to Use Rolling Window in pandas for Data Analysis

Use pandas.DataFrame.rolling(window) to create a rolling window object that slides over your data. Then apply functions like .mean(), .sum(), or custom functions to calculate statistics over each window.

Syntax

The basic syntax for using a rolling window in pandas is:

  • df.rolling(window): Creates a rolling window object with the specified window size.
  • window: Number of observations used for calculating the statistic.
  • After creating the rolling object, apply aggregation functions like .mean(), .sum(), .std(), etc.
python
df.rolling(window).function()
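
As a minimal sketch of this pattern, the rolling object can be created once and reused for several aggregations. The Series contents and the variable names here are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, not from the article
s = pd.Series([1, 2, 3, 4, 5])

# Create the rolling object once, then reuse it
roller = s.rolling(window=2)

rolling_sum = roller.sum()  # sum of each pair of neighbouring rows
rolling_max = roller.max()  # maximum of each pair of neighbouring rows

print(rolling_sum)
print(rolling_max)
```

The first row of each result is NaN because a window of 2 is not yet full at that position.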

Example

This example shows how to calculate the rolling mean over a window of 3 rows in a pandas DataFrame.

python
import pandas as pd

data = {'values': [10, 20, 30, 40, 50]}
df = pd.DataFrame(data)

# Calculate rolling mean with window size 3
df['rolling_mean'] = df['values'].rolling(window=3).mean()

print(df)
Output
   values  rolling_mean
0      10           NaN
1      20           NaN
2      30          20.0
3      40          30.0
4      50          40.0
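
To see where these numbers come from, the sketch below recomputes the same rolling mean by hand with plain Python slices and prints it next to the pandas result. The loop is only an illustration of the arithmetic, not how pandas implements it internally.

```python
import pandas as pd

values = [10, 20, 30, 40, 50]
window = 3

# Manual version: average the current row and the two rows before it
manual = []
for i in range(len(values)):
    if i + 1 < window:
        # Not enough rows yet to fill the window
        manual.append(float("nan"))
    else:
        chunk = values[i - window + 1 : i + 1]
        manual.append(sum(chunk) / window)

rolled = pd.Series(values).rolling(window=window).mean()

print(manual)
print(rolled.tolist())
```

For row 2, for example, the window holds 10, 20 and 30, so the mean is 60 / 3 = 20.0.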

Common Pitfalls

  • NaN values at the start: The first window - 1 rows will have NaN because there is not enough data to fill the window.
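  • Window size too large: If the window is larger than the data length, all results will be NaN by default, because min_periods defaults to the window size for integer windows.
  • Non-numeric data: Rolling aggregations such as .mean() need numeric data. Depending on the pandas version, non-numeric columns either raise an error or are dropped from the result, so select the numeric columns first.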
python
import pandas as pd

data = {'values': [10, 20, 30]}
df = pd.DataFrame(data)

# Wrong: window larger than data length
print(df['values'].rolling(window=5).mean())

# Right: window smaller or equal to data length
print(df['values'].rolling(window=2).mean())
Output
0   NaN
1   NaN
2   NaN
Name: values, dtype: float64
0     NaN
1    15.0
2    25.0
Name: values, dtype: float64
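
The NaN and non-numeric pitfalls can often be handled without shrinking the window. The sketch below assumes you are happy to compute a statistic from a partially filled window, which is what the min_periods argument allows, and it uses select_dtypes to keep only numeric columns. The 'label' column is a made-up example of non-numeric data.

```python
import pandas as pd

s = pd.Series([10, 20, 30])

# Default: every window needs 5 values, so all results are NaN
strict = s.rolling(window=5).mean()

# min_periods=1: compute the mean from whatever values are available
partial = s.rolling(window=5, min_periods=1).mean()

# Hypothetical DataFrame mixing numeric and text columns
df = pd.DataFrame({"values": [10, 20, 30], "label": ["a", "b", "c"]})

# Keep only numeric columns before rolling
numeric_means = df.select_dtypes(include="number").rolling(window=2).mean()

print(strict)
print(partial)
print(numeric_means)
```

Whether a partial-window mean is acceptable depends on the analysis, since early rows are then averaged over fewer observations than later ones.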

Quick Reference

| Method | Description |
|---|---|
| rolling(window) | Creates a rolling window object with the specified size |
| .mean() | Calculates the mean over each rolling window |
| .sum() | Calculates the sum over each rolling window |
| .std() | Calculates the standard deviation over each rolling window |
| .apply(func) | Applies a custom function to each rolling window |
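
Of the methods listed, .apply(func) is the least self-explanatory, so here is a small sketch. The value_range function and the sample data are hypothetical. Passing raw=True hands each window to the function as a NumPy array, which is usually faster than the default of passing a Series.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series([10, 20, 15, 40, 35])

def value_range(window_values):
    """Return the spread (max minus min) of one window."""
    return window_values.max() - window_values.min()

# raw=True passes each window as a NumPy array
rolling_range = s.rolling(window=3).apply(value_range, raw=True)

print(rolling_range)
```

The custom function must return a single number for each window.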

Key Takeaways

  • Use df.rolling(window) to create a rolling window over your data.
  • Apply aggregation functions like .mean() or .sum() on the rolling object to get moving statistics.
  • By default, the first window - 1 results are NaN because the window is not full yet.
  • Choose a window size that fits your data length; a window longer than the data gives all NaN results by default.
  • Rolling aggregations work only on numeric data columns.