Multi-dimensional fancy indexing in NumPy - Time & Space Complexity
We want to understand how the time needed to pick elements from a multi-dimensional array changes as the array grows.
Specifically, how does fancy indexing with multiple index arrays affect the work done?
Analyze the time complexity of the following code snippet.
import numpy as np
arr = np.arange(10000).reshape(100, 100)
rows = np.array([1, 3, 5, 7])
cols = np.array([2, 4, 6, 8])
result = arr[rows[:, None], cols]
This code selects elements from a 2D array using two index arrays for rows and columns, creating a new array of selected elements.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Accessing elements for each pair of row and column indices.
- How many times: For each row index, it accesses all column indices, so total accesses = number of rows in index array x number of columns in index array.
As the size of the index arrays grows, the number of element accesses grows by multiplying their lengths.
| Input Size (rows x cols) | Approx. Operations |
|---|---|
| 10 x 10 | 100 |
| 100 x 100 | 10,000 |
| 1000 x 1000 | 1,000,000 |
Pattern observation: The operations grow by multiplying the sizes of the row and column index arrays, so doubling both quadruples the work.
Time Complexity: O(r * c)
This means the time grows proportionally to the product of the lengths of the row and column index arrays.
[X] Wrong: "The time depends only on the size of the original array, not on the index arrays."
[OK] Correct: The original array size does not affect how many elements are picked; the work depends on how many indices we use to select elements.
Understanding how fancy indexing scales helps you reason about data selection performance in real projects, making your code efficient and clear.
"What if we used a single 1D index array instead of two 1D arrays for rows and columns? How would the time complexity change?"