pct_change() for percentage change in Pandas - Time & Space Complexity
We want to understand how the time needed to calculate percentage changes grows as the data size grows.
How does the work increase when we have more rows in our data?
Analyze the time complexity of the following code snippet.
```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50])
pct_changes = data.pct_change()
print(pct_changes)
```
This code calculates the percentage change between each value and the previous one in a list of numbers.
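To see what that calculation amounts to, here is a minimal sketch in plain Python (no pandas internals assumed) that reproduces the same result with an explicit loop, comparing each element only to its immediate predecessor:

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50])

# Manual equivalent: each element is compared only to the one right before it.
values = data.tolist()
manual = [float("nan")]  # the first element has no predecessor
for i in range(1, len(values)):
    manual.append((values[i] - values[i - 1]) / values[i - 1])

print(manual)                       # [nan, 1.0, 0.5, 0.333..., 0.25]
print(data.pct_change().tolist())   # same values, up to float rounding
```

The loop body runs once per element after the first, which is exactly the work pct_change() does.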
Identify the repeated operations: any loops, recursion, or array traversals.
- Primary operation: The function looks at each element and compares it to the one before it.
- How many times: It does this once for every element except the first.
As the number of data points grows, the work grows proportionally: each additional row adds exactly one more comparison.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 9 comparisons |
| 100 | About 99 comparisons |
| 1000 | About 999 comparisons |
Pattern observation: The work grows in a straight line with the number of data points.
Time Complexity: O(n)
This means the time to calculate percentage changes grows directly with the number of values.
[X] Wrong: "pct_change() compares every value to all previous values, so it takes much longer as data grows."
[OK] Correct: pct_change() only compares each value to the one right before it, so it only does one comparison per value, not many.
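One way to see the single-previous-value comparison is that, for clean data like this (no missing values to fill), pct_change() matches a simple shift-and-divide, which is one vectorized O(n) pass:

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50])

# Each value is divided by the value one step before it -- a single pass over the data.
equivalent = data / data.shift(1) - 1
print(equivalent.equals(data.pct_change()))
```

There is no nested loop anywhere: each output value depends on exactly one earlier value.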
Understanding how simple operations like pct_change() scale helps you explain your code's efficiency clearly and confidently.
"What if pct_change() was asked to compare each value to the value two steps before instead of one? How would the time complexity change?"
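pandas supports exactly this through the periods parameter. Comparing each value to the one two steps back still costs one operation per element, so the complexity stays O(n); only the number of leading NaN values changes:

```python
import pandas as pd

data = pd.Series([10, 20, 30, 40, 50])

# Compare each value to the one two positions earlier: still one operation per element.
two_step = data.pct_change(periods=2)
print(two_step)
# The first two entries are NaN; at index 2, (30 - 10) / 10 = 2.0.
```

Whatever the step size, each element is compared to a single earlier element, so the total work remains linear in n.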