Renaming columns in Pandas - Time & Space Complexity
We want to understand how the time it takes to rename columns in a pandas DataFrame changes as the number of columns grows.
How does the work increase when we rename more columns?
Analyze the time complexity of the following code snippet.
import pandas as pd
df = pd.DataFrame({f'col{i}': range(5) for i in range(100)})
new_names = {f'col{i}': f'new_col{i}' for i in range(100)}
df.rename(columns=new_names, inplace=True)
This code creates a DataFrame with 100 columns and renames all columns using a dictionary mapping.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: pandas goes through each column name to check if it needs to be renamed.
- How many times: Once for each column in the DataFrame.
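The per-column check can be pictured as a plain Python loop. This is only a conceptual sketch, not pandas' actual implementation; `rename_labels` is a hypothetical helper written for illustration.

```python
def rename_labels(labels, mapping):
    """Return new labels, replacing any label that appears in mapping."""
    # One dictionary lookup per label: average O(1) each, O(n) in total.
    return [mapping.get(label, label) for label in labels]


old_labels = [f"col{i}" for i in range(5)]
mapping = {"col1": "price", "col3": "qty"}

new_labels = rename_labels(old_labels, mapping)
print(new_labels)  # ['col0', 'price', 'col2', 'qty', 'col4']
```

Every label is visited once, whether or not it ends up with a new name.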
As the number of columns increases, the time to rename grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks and renames |
| 100 | About 100 checks and renames |
| 1000 | About 1000 checks and renames |
Pattern observation: The work grows linearly as the number of columns grows.
Time Complexity: O(n), where n is the number of columns
This means the time to rename columns grows in a straight line with the number of columns.
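One way to check this on your own machine is to time the rename for a few column counts. The sketch below is a rough measurement, not a rigorous benchmark: `time_rename` is a hypothetical helper, the sizes are arbitrary, and the exact numbers will vary with hardware and pandas version.

```python
import time

import numpy as np
import pandas as pd


def time_rename(n_cols, repeats=5):
    """Return the best wall-clock time (seconds) to rename all n_cols columns."""
    mapping = {f"col{i}": f"new_col{i}" for i in range(n_cols)}
    best = float("inf")
    for _ in range(repeats):
        # Build a fresh frame each round, because inplace=True changes the labels.
        df = pd.DataFrame(np.zeros((5, n_cols)), columns=list(mapping))
        start = time.perf_counter()
        df.rename(columns=mapping, inplace=True)
        best = min(best, time.perf_counter() - start)
    return best


timings = {n: time_rename(n) for n in (100, 1_000, 10_000)}
for n, seconds in timings.items():
    print(f"{n:>6} columns: {seconds * 1e3:.3f} ms")
```

For small frames, fixed overhead tends to dominate, so the linear trend usually becomes visible only at the larger sizes.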
[X] Wrong: "Renaming columns happens instantly no matter how many columns there are."
[OK] Correct: Even though renaming is fast, pandas must check each column name, so more columns mean more work.
Understanding how operations scale with data size helps you write efficient code and explain your choices clearly in real projects.
"What if we rename only a few columns instead of all? How would the time complexity change?"
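A small experiment can help explore this question. The sketch below passes a function instead of a dictionary to `rename` and records every label the function is called with; `counting_mapper`, `few`, and `seen` are names made up for this example. How often pandas calls the mapper is an implementation detail, so treat the count as an observation rather than a guarantee.

```python
import pandas as pd

n_cols = 100
df = pd.DataFrame({f"col{i}": range(5) for i in range(n_cols)})

# Only two of the 100 columns get a new name.
few = {"col0": "first", "col1": "second"}
seen = []  # every label the mapper is asked about


def counting_mapper(label):
    """Rename labels listed in `few`; leave all others unchanged."""
    seen.append(label)
    return few.get(label, label)


renamed = df.rename(columns=counting_mapper)
print(f"labels renamed: {len(few)}, mapper calls: {len(seen)}")
```

If the number of calls tracks the total number of columns rather than the two names being changed, that suggests the work still scales with n even for a partial rename.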