Swapping index levels in Pandas - Time & Space Complexity
When working with tables that have multiple index layers, swapping these layers is common.
We want to know how the time to swap index levels changes as the table grows.
Analyze the time complexity of the following code snippet.
```python
import pandas as pd

n, m = 10, 5  # Define n and m before use

# Create a DataFrame with a MultiIndex
index = pd.MultiIndex.from_tuples(
    [(i, j) for i in range(n) for j in range(m)],
    names=['level_0', 'level_1']
)
df = pd.DataFrame({'value': range(n * m)}, index=index)

# Swap the two index levels
swapped_df = df.swaplevel('level_0', 'level_1')
```
This code creates a table with two index layers and swaps their order.
Identify any loops, recursion, or array traversals that repeat.
- Primary operation: Swapping references to the codes arrays of the two levels.
- How many times: A constant number of times (independent of the DataFrame size).
Swapping index levels exchanges the positions of the two levels' entries in the index's `levels`, `codes`, and `names` lists. That cost depends on the number of levels, not on the number of rows.
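A minimal sketch of what this looks like from the outside, using the public `names`, `levels`, and `codes` attributes of a `MultiIndex`. The index here is a small made-up example built with `from_product`, not the DataFrame from the snippet above, and it only inspects the result; it does not show pandas internals.

```python
import numpy as np
import pandas as pd

# Small illustrative MultiIndex: 3 outer values x 2 inner values = 6 rows
idx = pd.MultiIndex.from_product(
    [range(3), ["a", "b"]], names=["level_0", "level_1"]
)
swapped = idx.swaplevel("level_0", "level_1")

# The names trade positions
print(list(idx.names), "->", list(swapped.names))

# The codes array of each level reappears unchanged at the other position
print(np.array_equal(idx.codes[0], swapped.codes[1]))
print(np.array_equal(idx.codes[1], swapped.codes[0]))

# The same holds for the level values
print(idx.levels[0].equals(swapped.levels[1]))
```

Each per-level array shows up in the swapped index with the same contents, just at a different position, which is consistent with the swap being a reordering of a few list entries.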
| Input Size (n x m rows) | Approx. Operations |
|---|---|
| 10 | About 10 operations |
| 100 | About 10 operations |
| 1000 | About 10 operations |
Pattern observation: The work is constant and does not grow with the number of rows.
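One way to check this pattern empirically is to time the index swap at several sizes. The sketch below is an assumption-laden illustration: `time_index_swap` is a hypothetical helper, it times `MultiIndex.swaplevel` only (not `DataFrame.swaplevel`), and the absolute numbers will vary by machine and pandas version.

```python
import timeit

import pandas as pd


def time_index_swap(n_rows, repeats=200):
    """Return average seconds per MultiIndex.swaplevel call (hypothetical helper)."""
    # Two levels: n_rows // 5 outer values x 5 inner values
    idx = pd.MultiIndex.from_product(
        [range(n_rows // 5), range(5)], names=["level_0", "level_1"]
    )
    total = timeit.timeit(
        lambda: idx.swaplevel("level_0", "level_1"), number=repeats
    )
    return total / repeats


for rows in (1_000, 100_000, 1_000_000):
    micros = time_index_swap(rows) * 1e6
    print(f"{rows:>9} rows: {micros:.1f} microseconds per swap")
```

If the per-call time stays roughly flat while the row count grows by a factor of 1,000, that supports the constant-work observation for the index swap.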
Time Complexity: O(1)
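This means the time to swap the index levels themselves stays constant as the table gets bigger. One caveat: `DataFrame.swaplevel` returns a new DataFrame, and depending on the pandas version and whether Copy-on-Write is enabled, creating that object may copy the underlying data, which adds work proportional to the number of rows.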
[X] Wrong: "Swapping index levels takes time proportional to the number of rows."
[OK] Correct: The index swap exchanges references to the two levels' codes arrays in constant time, without iterating over the rows.
Understanding how operations scale with data size helps you write efficient code and explain your choices clearly.
"What if the DataFrame had three index levels instead of two? How would the time complexity change when swapping two levels?"