Strides and how data is accessed in NumPy - Time & Space Complexity
When working with numpy arrays, the way data is accessed matters for speed. Here we look at how numpy's memory-access pattern affects running time by analyzing the time complexity of the following code snippet.
```python
import numpy as np

arr = np.arange(10000).reshape(100, 100)  # 100x100 array of 0..9999
sub_arr = arr[:, ::2]                     # every second column (a strided view, not a copy)
result = np.sum(sub_arr)                  # sum the 5,000 selected elements
```
This code creates a 100x100 array, takes every second column, then sums all values.
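It may help to confirm that the slice really is a strided view rather than a copy. A minimal sketch using `np.shares_memory` and the `.strides` attribute:

```python
import numpy as np

arr = np.arange(10000).reshape(100, 100)
sub_arr = arr[:, ::2]

# The slice shares memory with the original array: no copy is made.
print(np.shares_memory(arr, sub_arr))  # True

# Strides are measured in bytes. Stepping by ::2 doubles the column
# stride, e.g. (800, 8) -> (800, 16) with an 8-byte integer dtype.
print(arr.strides, sub_arr.strides)
```

Because no data is copied, the cost of the snippet comes almost entirely from the sum, which must still visit each selected element.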
To analyze it, identify the repeated step that accounts for most of the running time.
- Primary operation: Summing all elements in the sliced array.
- How many times: Once for each element in the sliced array (100 rows x 50 columns = 5000 elements).
As the array size grows, the sum operation takes longer because it reads more elements.
| Input Size (n x n) | Approx. Elements Summed (every 2nd column) |
|---|---|
| 10 x 10 | 50 |
| 100 x 100 | 5,000 |
| 1000 x 1000 | 500,000 |
Pattern observation: Operations grow roughly with the number of elements accessed, which depends on the slice size.
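The counts in the table can be verified directly from the slice's shape; a quick sketch:

```python
import numpy as np

for n in (10, 100, 1000):
    arr = np.arange(n * n).reshape(n, n)
    sub = arr[:, ::2]   # n rows, n/2 columns
    print(n, sub.size)  # 50, 5000, 500000
```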
Time Complexity: O(n^2)
For an n x n array, the slice selects n x n/2 = n^2/2 elements; the constant factor 1/2 is dropped, so the time grows with the total number of elements accessed in the sliced array.
[X] Wrong: "Slicing with steps like ::2 makes the operation twice as fast always."
[OK] Correct: Slicing with a step creates a strided view, not a faster array: numpy still reads every element included in the view, so time depends on how many elements are accessed, not just the step size. Strided reads can also use CPU caches less efficiently, so the speedup from ::2 is often less than 2x.
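A rough timing sketch can show why the "always twice as fast" claim fails. Exact numbers vary by machine and array size, so treat this as an experiment, not a guarantee:

```python
import time
import numpy as np

n = 4000
arr = np.arange(n * n, dtype=np.int64).reshape(n, n)

def best_time(a, repeats=10):
    """Return the best of several timed np.sum calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.sum(a)
        best = min(best, time.perf_counter() - start)
    return best

full = best_time(arr)          # all n*n contiguous elements
half = best_time(arr[:, ::2])  # half the elements, but strided access
# Half the elements rarely means half the time: each 64-byte cache
# line is only half used when every other value is skipped.
print(f"full: {full:.5f}s  half columns: {half:.5f}s")
```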
Understanding how numpy accesses data helps you write faster code and explain performance clearly in interviews.
What if we changed the slice to arr[::2, ::2]? How would the time complexity change?
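One way to explore the follow-up question is to count the elements the slice actually selects; a sketch:

```python
import numpy as np

for n in (10, 100, 1000):
    arr = np.arange(n * n).reshape(n, n)
    sub = arr[::2, ::2]  # every second row AND every second column
    # (n/2) x (n/2) = n^2/4 elements: a smaller constant factor,
    # but the growth is still quadratic in n.
    print(n, sub.size)
```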