Rolling Window Calculations in Python Data Analysis - Time & Space Complexity
A rolling window calculation shows how values change across a small, moving range of the data.
We ask: How does the time to compute change as the data grows larger?
Analyze the time complexity of the following code snippet.
```python
import pandas as pd

data = pd.Series(range(1, 101))  # 100 values: 1 through 100
window_size = 5
rolling_means = data.rolling(window=window_size).mean()
```
This code calculates the mean of every 5 consecutive values in a Series of 100 numbers.
Identify the repeated work: the loops, recursion, or array traversals.
- Primary operation: Calculating the mean for each rolling window.
- How many times: Once for each position where the window fits, which is exactly n - window_size + 1 times (96 for n = 100 and a window of 5).
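To make that repeated operation visible, here is a hand-rolled sketch of the same computation (an illustrative plain-Python version, not how pandas implements it internally) that counts one mean calculation per window position:

```python
# Naive rolling mean: one mean calculation per window position.
def naive_rolling_means(values, window_size):
    means = []
    count = 0  # number of window positions processed
    for start in range(len(values) - window_size + 1):
        window = values[start:start + window_size]
        means.append(sum(window) / window_size)
        count += 1
    return means, count

means, count = naive_rolling_means(list(range(1, 101)), 5)
print(count)      # 96 positions: n - window_size + 1
print(means[0])   # 3.0, the mean of 1..5
```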
As the data size grows, the number of rolling windows grows roughly the same as the data size.
| Input Size (n) | Window Calculations (window = 5) |
|---|---|
| 10 | 6 |
| 100 | 96 |
| 1000 | 996 |
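The table entries follow directly from the formula n - window_size + 1, which a quick check confirms:

```python
# Window-count check for the table above, assuming window_size = 5.
window_size = 5
for n in (10, 100, 1000):
    print(n, n - window_size + 1)
```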
Pattern observation: The number of calculations grows roughly in a straight line as data size increases.
Time Complexity: O(n)
This means the time to compute grows in direct proportion to the size of the data (assuming the window size stays fixed).
[X] Wrong: "Rolling calculations take the same time no matter how big the data is."
[OK] Correct: Each new position of the window requires a calculation, so more data means more calculations.
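Efficient implementations avoid re-summing each window from scratch. The sketch below (an illustrative technique, not a claim about pandas internals) maintains a running total: each new position adds the entering value and subtracts the leaving one, so every step is O(1) and the whole pass is O(n) even for large windows.

```python
# Sliding-sum rolling mean: update a running total instead of
# re-summing each window (O(1) work per position after the first).
def sliding_rolling_means(values, window_size):
    means = []
    running = sum(values[:window_size])  # first window: one full sum
    means.append(running / window_size)
    for i in range(window_size, len(values)):
        # add the value entering the window, drop the one leaving it
        running += values[i] - values[i - window_size]
        means.append(running / window_size)
    return means

print(sliding_rolling_means([1, 2, 3, 4, 5, 6], 3))  # [2.0, 3.0, 4.0, 5.0]
```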
Understanding how rolling calculations scale helps you explain data processing speed clearly and confidently.
"What if we increased the window size to cover half the data? How would the time complexity change?"