
Rolling mean and sum in Pandas - Time & Space Complexity

Time Complexity: Rolling mean and sum
O(n)
Understanding Time Complexity

We want to understand how the time to calculate rolling mean and sum changes as the data size grows.

How does the work increase when we have more data points?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

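import pandas as pd

n = 10  # Example value for n

# Series holding the values 1..n
data = pd.Series(range(1, n + 1))
window_size = 3

# Each result covers the current value and the window_size - 1 values before it
rolling_mean = data.rolling(window=window_size).mean()
rolling_sum = data.rolling(window=window_size).sum()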

This code calculates the rolling mean and sum over a sliding window of size 3 on a series of length n.
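
To make the result concrete, the snippet can be run with n = 10 and its output inspected. The sketch below only adds print statements to the code above; it assumes the default `min_periods` behavior, where a result is produced only once the window is full.

```python
import pandas as pd

n = 10
window_size = 3
data = pd.Series(range(1, n + 1))

rolling_sum = data.rolling(window=window_size).sum()
rolling_mean = data.rolling(window=window_size).mean()

# The first window_size - 1 entries are NaN because the window is not yet full.
# After that, each entry summarizes the current value and the two before it,
# e.g. the third entry of the sum is 1 + 2 + 3 = 6.
print(rolling_sum.tolist())
print(rolling_mean.tolist())
```

Both results have the same length as the input, so the output size also grows linearly with n.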

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: For each position in the data, the 3 values in the window are combined to produce one mean and one sum.
  • How many times: Once for each of the n data points. The first window_size - 1 positions have an incomplete window and yield NaN instead of a computed value.
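
The repeated work described above can be modeled with a plain Python loop. This is a minimal sketch for counting operations, not a description of how pandas computes the result internally; the function name `naive_rolling_sum_and_mean` and the `element_reads` counter are hypothetical names introduced for illustration.

```python
import math


def naive_rolling_sum_and_mean(values, window_size):
    """Recompute every full window from scratch and count element reads."""
    sums, means = [], []
    element_reads = 0
    for i in range(len(values)):
        if i + 1 < window_size:
            # Window is incomplete: mirror the NaN that pandas reports by default.
            sums.append(math.nan)
            means.append(math.nan)
            continue
        window = values[i + 1 - window_size : i + 1]
        element_reads += len(window)  # one read per element in the window
        total = sum(window)
        sums.append(total)
        means.append(total / window_size)
    return sums, means, element_reads


sums, means, reads = naive_rolling_sum_and_mean(list(range(1, 11)), 3)
print(sums[2:])  # [6, 9, 12, 15, 18, 21, 24, 27]
print(reads)     # 24, i.e. (10 - 3 + 1) full windows * 3 elements each
```

The outer loop runs n times and the inner work is bounded by the window size, which is the pattern the two bullets describe.
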
How Execution Grows With Input

As the data size n grows, the number of operations grows roughly linearly with n.

Input Size (n) | Approx. Operations
---------------|--------------------------------------
10             | About 10 times the window size (3)
100            | About 100 times the window size (3)
1000           | About 1000 times the window size (3)

Pattern observation: The total work grows linearly with n because each new data point requires a fixed amount of work related to the window size.
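
The approximate counts in the table can be checked with a little arithmetic. The sketch below assumes a simple cost model in which every full window is re-read from scratch, so the count is (n - window_size + 1) * window_size; the helper name `naive_operation_count` is hypothetical.

```python
def naive_operation_count(n, window_size):
    """Element reads if every full window is re-read from scratch."""
    full_windows = max(n - window_size + 1, 0)
    return full_windows * window_size


for n in (10, 100, 1000):
    print(n, naive_operation_count(n, 3))
# 10 24
# 100 294
# 1000 2994
```

Multiplying n by 10 multiplies the count by roughly 10, which is the signature of linear growth.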

Final Time Complexity

Time Complexity: O(n)

This means the time to compute rolling mean and sum grows in direct proportion to the number of data points.

Common Mistake

[X] Wrong: "Calculating rolling mean and sum takes time proportional to n times the window size squared."

[OK] Correct: Even if every window is recomputed from scratch, the work is about n times the window size, not n times the window size squared. Because the window size is a fixed constant (3 here), that simplifies to O(n). The growth depends on n.
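
One way to see that each data point needs only a fixed amount of work is a running-total approach: add the value entering the window and subtract the value leaving it. This is a sketch of a common sliding-window technique, offered as an illustration rather than as a statement about the pandas implementation; `sliding_sum_and_mean` is a hypothetical name.

```python
import math


def sliding_sum_and_mean(values, window_size):
    """Maintain a running total so each step does constant work."""
    sums, means = [], []
    running_total = 0
    for i, value in enumerate(values):
        running_total += value  # value entering the window
        if i >= window_size:
            running_total -= values[i - window_size]  # value leaving the window
        if i + 1 < window_size:
            # Window is still incomplete.
            sums.append(math.nan)
            means.append(math.nan)
        else:
            sums.append(running_total)
            means.append(running_total / window_size)
    return sums, means


sums, means = sliding_sum_and_mean(list(range(1, 11)), 3)
print(sums[2:])   # [6, 9, 12, 15, 18, 21, 24, 27]
print(means[2:])  # [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
```

Each iteration performs one addition and at most one subtraction, so the total work is proportional to n in this sketch.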

Interview Connect

Understanding how rolling calculations scale helps you explain performance when working with time series or streaming data in real projects.

Self-Check

"What if the window size grew proportionally with n? How would the time complexity change?"