np.sum() and axis parameter in NumPy - Time & Space Complexity
We want to understand how the time it takes to add numbers using np.sum() changes as the size of the data grows.
Specifically, how does choosing different axes affect the work done?
Analyze the time complexity of the following code snippet.
import numpy as np

arr = np.random.rand(1000, 500)    # 2D array: 1000 rows, 500 columns (500,000 elements)

sum_all = np.sum(arr)              # one scalar: the sum of all 500,000 elements
sum_axis0 = np.sum(arr, axis=0)    # collapses the rows: one sum per column, shape (500,)
sum_axis1 = np.sum(arr, axis=1)    # collapses the columns: one sum per row, shape (1000,)
This code creates a 1000 x 500 array and computes three sums: the grand total of every element, the column sums (axis=0 collapses the rows), and the row sums (axis=1 collapses the columns).
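A quick way to see what each call returns is to inspect the output shapes; this sketch assumes the same 1000 x 500 array as the snippet above:

```python
import numpy as np

arr = np.random.rand(1000, 500)

# Summing everything collapses the whole array to a 0-d scalar.
print(np.sum(arr).shape)           # ()
# axis=0 collapses the 1000 rows: one sum per column.
print(np.sum(arr, axis=0).shape)   # (500,)
# axis=1 collapses the 500 columns: one sum per row.
print(np.sum(arr, axis=1).shape)   # (1000,)
```

Note that the axis you pass is the one that disappears from the result, which is why the two axis sums have different lengths.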
Look at what repeats when summing.
- Primary operation: Adding each number in the array.
- How many times: Once for each element in the array, no matter which axis is chosen.
As the array gets bigger, the number of additions grows in direct proportion.
| Input Size (n x m) | Approx. Operations |
|---|---|
| 10 x 10 | 100 additions |
| 100 x 100 | 10,000 additions |
| 1000 x 1000 | 1,000,000 additions |
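The operation counts in the table are just the element counts: an n x n array requires n * n additions no matter which axis is chosen. A minimal check, using `arr.size` as a stand-in for the number of additions:

```python
import numpy as np

# Every element is added exactly once, so the number of additions
# equals the number of elements, arr.size == n * n.
for n in (10, 100, 1000):
    arr = np.zeros((n, n))
    print(f"{n} x {n}: {arr.size} additions")
```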
Pattern observation: Doubling the size in each dimension quadruples the work, because the total element count n * m grows by a factor of four.
Time Complexity: O(n * m)
This means the time to sum grows directly with the total number of elements in the array.
[X] Wrong: "Summing along an axis is faster because the result contains fewer numbers, so fewer additions must happen."
[OK] Correct: Even when summing along one axis, every element is still read and added exactly once; only the output shape shrinks, so the total work remains O(n * m).
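One way to convince yourself of this: re-summing the axis results must reproduce the grand total, since both paths add every element exactly once. A small sketch (reusing the array shape from the snippet above):

```python
import numpy as np

arr = np.random.rand(1000, 500)
total = np.sum(arr)

# Summing the 500 column sums (axis=0) combines all 500,000 elements again,
# and so does summing the 1000 row sums (axis=1): same elements, same work.
assert np.isclose(np.sum(np.sum(arr, axis=0)), total)
assert np.isclose(np.sum(np.sum(arr, axis=1)), total)
```

`np.isclose` is used rather than `==` because floating-point addition order can change the result in the last few bits.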
Knowing how array operations scale helps you write efficient code and explain your choices clearly in real projects or interviews.
What if we used np.sum() on a 3D array with the axis parameter? How would the time complexity change?
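As a starting point for that question, here is a sketch with a hypothetical 40 x 30 x 20 array: whichever axis is collapsed, every one of the n * m * k elements is still read once, so the complexity is O(n * m * k) in all cases and only the output shape differs.

```python
import numpy as np

cube = np.random.rand(40, 30, 20)   # n * m * k = 24,000 elements

# Each axis choice collapses one dimension but still touches all 24,000
# elements, so the work is O(n * m * k) regardless of the axis.
print(np.sum(cube, axis=0).shape)   # (30, 20)
print(np.sum(cube, axis=1).shape)   # (40, 20)
print(np.sum(cube, axis=2).shape)   # (40, 30)
```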