Understanding array memory layout in NumPy - Complexity Analysis
We want to see how the way NumPy stores arrays in memory affects the speed of operations: how does the memory layout change the time it takes to access or process array elements? Below we analyze the time complexity of accessing every element in NumPy arrays with different memory layouts.
```python
import numpy as np

arr_c = np.arange(10000).reshape(100, 100)  # C-contiguous (row-major)
arr_f = np.asfortranarray(arr_c)            # F-contiguous (column-major)

# Row-major traversal: the inner loop walks along a row,
# matching the C layout where rows are contiguous in memory.
sum_c = 0
for i in range(100):
    for j in range(100):
        sum_c += arr_c[i, j]

# Column-major traversal: the inner loop walks down a column,
# matching the F layout where columns are contiguous in memory.
sum_f = 0
for j in range(100):
    for i in range(100):
        sum_f += arr_f[i, j]
```
This code sums all elements of two arrays stored differently in memory, with the nested loops ordered to match each array's layout. Look at the loops and count how many times elements are accessed.
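The layouts can be verified and compared directly. A minimal sketch (the array size, repeat count, and the helper `time_row_sums` are illustrative choices, not part of the example above):

```python
import time
import numpy as np

arr_c = np.arange(1_000_000).reshape(1000, 1000)  # C-contiguous
arr_f = np.asfortranarray(arr_c)                  # F-contiguous

# The .flags attribute reports each array's memory layout.
print(arr_c.flags['C_CONTIGUOUS'], arr_f.flags['F_CONTIGUOUS'])  # True True

def time_row_sums(a):
    # Sum along rows repeatedly; on a C-contiguous array this reads
    # memory sequentially, while on an F-contiguous array it strides.
    start = time.perf_counter()
    for _ in range(10):
        a.sum(axis=1)
    return time.perf_counter() - start

t_c = time_row_sums(arr_c)
t_f = time_row_sums(arr_f)
print(f"C layout: {t_c:.4f}s, F layout: {t_f:.4f}s")
```

Exact timings depend on the machine and array size, but the layout that matches the traversal direction typically wins.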
- Primary operation: Accessing each element of the array once.
- How many times: 10,000 times (100 rows x 100 columns).
- Loop structure: two nested loops each run to completion; only the nesting order differs between the two arrays.
As the array size grows, the number of element accesses grows with the total number of elements.
| Input Size (n x n) | Approx. Operations |
|---|---|
| 10 x 10 | 100 |
| 100 x 100 | 10,000 |
| 1000 x 1000 | 1,000,000 |
Pattern observation: Operations grow with the total number of elements, which is n squared.
Time Complexity: O(n²)
This means the time to access all elements grows with the square of the array dimension.
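The n² pattern in the table can be confirmed by counting accesses explicitly; `count_accesses` below is a hypothetical helper added for illustration:

```python
import numpy as np

def count_accesses(n):
    # Sum an n x n array with nested loops, counting element accesses.
    a = np.arange(n * n).reshape(n, n)
    accesses = 0
    total = 0
    for i in range(n):
        for j in range(n):
            total += a[i, j]
            accesses += 1
    return accesses

# Matches the table: operations grow as n squared.
for n in (10, 100):
    print(n, count_accesses(n))  # 10 -> 100, 100 -> 10000
```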
[X] Wrong: "Accessing elements in any order takes the same time because the number of elements is the same."
[OK] Correct: The order of access determines how memory is read. Accessing elements in the order they are stored (contiguously) is faster because CPUs fetch memory in cache lines, so sequential reads reuse data that is already in the cache, while strided reads keep fetching new lines.
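One way to see what "stored contiguously" means is to inspect `strides`, which NumPy exposes on every array. This sketch pins the dtype to 64-bit integers so the byte counts are concrete:

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)  # C order
b = np.asfortranarray(a)                        # F order

# Strides give the byte step to reach the next element along each axis.
# With 8-byte integers, the smallest stride marks the contiguous axis.
print(a.strides)  # (24, 8): stepping within a row moves 8 bytes
print(b.strides)  # (8, 16): stepping within a column moves 8 bytes
```

The traversal whose inner loop follows the 8-byte stride reads memory sequentially; the other jumps around.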
Understanding how data layout affects speed shows you think about efficiency beyond just counting loops. This skill helps you write faster code and explain your choices clearly.
"What if we changed the array to be 3D instead of 2D? How would the time complexity change?"