applymap() for DataFrame-wide operations in Pandas - Time & Space Complexity
We want to understand how the running time of applymap() changes as the DataFrame gets bigger: how does the number of cells affect the work done?
Analyze the time complexity of the following code snippet.
```python
import pandas as pd
import numpy as np

n, m = 10, 10  # example sizes: n rows, m columns
df = pd.DataFrame(np.random.randint(0, 100, size=(n, m)))

# Apply the lambda to every cell; note that in pandas 2.1+,
# applymap() is deprecated in favor of DataFrame.map()
result = df.applymap(lambda x: x * 2)
```
This code doubles every number in the DataFrame using applymap().
Identify the loops, recursion, or array traversals that repeat:
- Primary operation: Applying the function to each cell in the DataFrame.
- How many times: once per cell, for a total of rows x columns calls.
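We can verify the once-per-cell claim directly. The sketch below (using a hypothetical `double` helper that increments a counter each time it runs) shows that applymap() invokes the function exactly rows x columns times:

```python
import pandas as pd
import numpy as np

# Hypothetical helper that counts its own invocations
calls = {"count": 0}

def double(x):
    calls["count"] += 1
    return x * 2

n, m = 10, 10
df = pd.DataFrame(np.random.randint(0, 100, size=(n, m)))
df.applymap(double)

print(calls["count"])  # n * m = 100 calls, one per cell
```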
As the DataFrame grows, the work grows with the number of cells.
| Input Size (rows x columns) | Approx. Operations |
|---|---|
| 10 x 10 = 100 | 100 function calls |
| 100 x 100 = 10,000 | 10,000 function calls |
| 1000 x 1000 = 1,000,000 | 1,000,000 function calls |
Pattern observation: The number of operations grows directly with the total number of cells.
Time Complexity: O(n x m), where n is the number of rows and m is the number of columns.
This means the running time grows in direct proportion to the total number of cells in the DataFrame.
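A rough timing sketch can make the linear growth visible. Exact numbers vary by machine, but when the cell count grows 100x, the applymap() time should grow roughly in proportion:

```python
import time
import pandas as pd
import numpy as np

# Time applymap() on a 100x100 and a 1000x1000 DataFrame (100x more cells)
timings = {}
for n in (100, 1000):
    df = pd.DataFrame(np.random.randint(0, 100, size=(n, n)))
    start = time.perf_counter()
    df.applymap(lambda x: x * 2)
    timings[n] = time.perf_counter() - start
    print(f"{n} x {n} = {n * n:,} cells: {timings[n]:.4f}s")
```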
[X] Wrong: "applymap() runs in constant time regardless of DataFrame size."
[OK] Correct: The function runs once per cell, so more cells mean more work and more time.
Knowing how applymap() scales helps you explain performance when working with large tables.
What if we changed applymap() to apply() on columns? How would the time complexity change?
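As a sketch of the trade-off: apply() on columns still touches every cell, so the asymptotic complexity stays O(n x m), but the Python-level function runs only once per column (m calls) and the per-cell work inside each call is vectorized by NumPy, which is typically much faster in practice:

```python
import pandas as pd
import numpy as np

n, m = 1000, 10
df = pd.DataFrame(np.random.randint(0, 100, size=(n, m)))

# applymap: the Python lambda runs once per cell -> n * m = 10,000 calls
per_cell = df.applymap(lambda x: x * 2)

# apply on columns: the lambda runs once per column -> m = 10 calls;
# the doubling inside each call is a vectorized NumPy operation
per_column = df.apply(lambda col: col * 2)

# Both approaches produce the same values
print(bool((per_cell == per_column).all().all()))
```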