map() for element-wise transformation in Pandas - Time & Space Complexity
We want to understand how the time taken by pandas' map() grows with the size of the data: specifically, how does applying a function to each element scale as the number of elements increases?
Analyze the time complexity of the following code snippet.
```python
import pandas as pd

df = pd.DataFrame({'A': range(1, 101)})
df['B'] = df['A'].map(lambda x: x * 2)
```
This code creates a DataFrame with 100 numbers and uses map() to double each number in column 'A', storing results in column 'B'.
Identify the repeated work: loops, recursion, or array traversals.
- Primary operation: Applying the function to each element in the column.
- How many times: Once for each element in the column (n times).
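One way to see those n function calls directly is to instrument the mapped function with a counter. This is a quick sketch; the `double` helper and its counter exist only for this demonstration:

```python
import pandas as pd

call_count = 0

def double(x):
    """Double a value, counting how many times map() invokes this function."""
    global call_count
    call_count += 1
    return x * 2

df = pd.DataFrame({'A': range(1, 101)})
df['B'] = df['A'].map(double)

print(call_count)  # one call per element: 100
```

The counter ends at exactly the number of elements in the column, confirming that map() does the work element by element.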
As the number of elements grows, the total work grows proportionally.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 function calls |
| 100 | 100 function calls |
| 1000 | 1000 function calls |
Pattern observation: Doubling the input size roughly doubles the work done.
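The table above can be reproduced by counting invocations at several input sizes. A sketch, where `calls_for` is a hypothetical helper written just for this check:

```python
import pandas as pd

def calls_for(n):
    """Count how many times map() invokes the function for n elements."""
    count = 0

    def f(x):
        nonlocal count
        count += 1
        return x * 2

    pd.Series(range(n)).map(f)
    return count

print(calls_for(10), calls_for(100), calls_for(1000))  # 10 100 1000
```

Doubling n doubles the call count, which is exactly the linear pattern in the table.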
Time Complexity: O(n)
This means the time grows linearly with the number of elements you apply map() to.
[X] Wrong: "Using map() applies the function instantly to all elements at once, so time does not grow with data size."
[OK] Correct: Actually, map() applies the function to each element one by one, so more elements mean more work and more time.
Understanding how element-wise operations scale helps you write efficient data transformations and explain your code clearly in interviews.
"What if we replaced map() with a vectorized operation like * 2 directly on the column? How would the time complexity change?"