Resetting index in Pandas - Time & Space Complexity
When we reset the index of a pandas DataFrame, we replace its row labels with the default integer labels 0, 1, 2, and so on. Understanding how long this takes helps us work efficiently with data.
We want to know how the time to reset the index changes as the number of rows in the DataFrame grows.
Analyze the time complexity of the following code snippet.
import pandas as pd
n = 10 # Example size
df = pd.DataFrame({
    'A': range(n),
    'B': range(n, 0, -1)
})
# Reset the index of the DataFrame
df_reset = df.reset_index(drop=True)
This code creates a DataFrame with n rows and then resets its index to default integer labels.
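The DataFrame in the snippet already has default labels, so the reset is hard to see there. The following minimal sketch uses a custom index (the labels 10 to 50 are an assumption added for illustration, not part of the original example) to show what `reset_index(drop=True)` returns.

```python
import pandas as pd

# Hypothetical custom labels, chosen only so the effect of the reset is visible.
df = pd.DataFrame(
    {"A": range(5), "B": range(5, 0, -1)},
    index=[10, 20, 30, 40, 50],
)

# drop=True discards the old labels instead of keeping them as a column.
df_reset = df.reset_index(drop=True)

print(type(df_reset.index).__name__)  # the new index is a RangeIndex
print(list(df_reset.index))           # default integer labels starting at 0
print(list(df.index))                 # the original DataFrame keeps its labels
```

Note that `reset_index` returns a new DataFrame and leaves `df` untouched unless `inplace=True` is passed.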
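- Primary operation: Building the new DataFrame. The new default index is a RangeIndex, which is described by a start, stop, and step rather than stored label by label, so most of the work is copying the row data. This assumes pandas copies eagerly; with Copy-on-Write enabled the copy may be deferred.
- How many times: Once for each row in the DataFrame (n times), for each column.
As the number of rows grows, the work to reset the index grows too, because every row's data is carried over into the new DataFrame.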
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 operations |
| 100 | About 100 operations |
| 1000 | About 1000 operations |
Pattern observation: The operations increase directly with the number of rows.
Time Complexity: O(n)
This means the time to reset the index grows linearly with the number of rows: doubling the rows roughly doubles the time.
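The linear pattern can be checked empirically. The sketch below is a rough measurement, not a benchmark: the helper `time_reset_index`, the chosen sizes, and the repeat count are all assumptions made for illustration. Absolute numbers depend on the machine and the pandas version, and if Copy-on-Write is enabled the data copy may be deferred, in which case the timings may look nearly flat.

```python
import time

import pandas as pd


def time_reset_index(n, repeats=5):
    """Return the best wall-clock time in seconds for reset_index(drop=True) on n rows."""
    df = pd.DataFrame({"A": range(n), "B": range(n, 0, -1)})
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        df.reset_index(drop=True)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical sizes, each ten times larger than the previous one.
sizes = [10_000, 100_000, 1_000_000]
timings = {n: time_reset_index(n) for n in sizes}

for n, seconds in timings.items():
    print(f"n={n:>9,}: {seconds * 1e6:10.1f} microseconds")
```

Taking the best of several repeats reduces noise from other processes. For small n, fixed overhead dominates, so the linear trend tends to show only at larger sizes.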
[X] Wrong: "Resetting the index is instant no matter how big the DataFrame is."
[OK] Correct: Even though it feels simple, pandas builds a new DataFrame and, by default, carries every row's data over to it, so bigger DataFrames take more time.
Knowing how operations like resetting an index scale helps you write efficient data code and explain your choices clearly in real projects.
"What if we reset the index but keep the old index as a new column? How would the time complexity change?"
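As a starting point for this question, here is a minimal sketch of the variant it describes. The string labels are an assumption added for illustration. The sketch only shows what the call produces and leaves the complexity analysis to the reader.

```python
import pandas as pd

# Hypothetical string labels, so the old index is easy to spot in the result.
df = pd.DataFrame({"A": range(3)}, index=["x", "y", "z"])

# drop=False is the default: the old index is kept as a regular column.
df_kept = df.reset_index()

print(df_kept.columns.tolist())   # the old labels land in a column named 'index'
print(df_kept["index"].tolist())  # the old labels themselves
print(list(df_kept.index))        # the new default integer labels
```

When the old index has no name, the new column is called `index`. If the index has a name, that name is used for the column instead.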