Why indexing matters in NumPy - Performance Analysis
We want to measure how quickly NumPy operations run when we select parts of an array using different indexing methods. How does the way we select data affect the time it takes?
Analyze the time complexity of the following code snippet.
import numpy as np
arr = np.arange(1000000)
# Access a single element
single = arr[500000]
# Access a slice
slice_ = arr[100000:200000]
# Access with a boolean mask
mask = arr % 2 == 0
filtered = arr[mask]
This code shows three ways to get data from a large array: a single element, a slice, and a filtered subset.
Consider which operation repeats as we select data.
- Primary operation: Accessing elements by index or condition.
- How many times: once for the single element; for the slice, creating it is cheap, but reading or copying it touches every element in the slice; for the mask, once per element of the whole array to evaluate the condition, plus once per match to copy it out.
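These counts can be checked empirically. The sketch below (timings are illustrative and machine-dependent) uses `timeit` to compare a single-element access against a boolean-mask pass as the array grows:

```python
import timeit
import numpy as np

# Compare a constant-time single-element access against an O(n)
# boolean-mask pass for increasing array sizes.
for n in (10_000, 100_000, 1_000_000):
    arr = np.arange(n)
    t_single = timeit.timeit(lambda: arr[n // 2], number=100)
    t_mask = timeit.timeit(lambda: arr[arr % 2 == 0], number=100)
    print(f"n={n:>9,}  single={t_single:.5f}s  mask={t_mask:.5f}s")
```

Expect the single-access column to stay roughly flat while the mask column grows roughly in proportion to n.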
How does time grow when array size grows?
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Single: 1, Slice: up to 10, Mask: 10 |
| 100 | Single: 1, Slice: up to 100, Mask: 100 |
| 1000 | Single: 1, Slice: up to 1000, Mask: 1000 |
Picking one element stays fast no matter the size. Reading a slice grows with the slice length, and building a boolean mask always scans the entire array.
Time Complexity: O(k), where k is the number of elements accessed or checked
This means runtime grows with how many elements you read or test; a single-element access stays O(1) regardless of the total array size.
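One NumPy-specific nuance, sketched here for illustration: basic slicing returns a *view* onto the same memory, so creating the slice itself is constant time; the O(k) cost comes from reading or copying its k elements. Boolean masking, by contrast, always returns a copy:

```python
import numpy as np

arr = np.arange(1_000_000)

# Basic slicing creates a view: no elements are copied at creation time.
view = arr[100_000:200_000]
assert np.shares_memory(view, arr)

# Boolean masking scans the whole array and copies every match.
copy = arr[arr % 2 == 0]
assert not np.shares_memory(copy, arr)
```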
[X] Wrong: "Accessing any element in numpy always takes the same time regardless of how many elements we get."
[OK] Correct: Accessing one element is fast, but getting many elements or filtering means more work and more time.
Knowing how indexing affects speed helps you write faster code and explain your choices clearly in real projects.
"What if we used fancy indexing with a list of random indices? How would the time complexity change?"
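As a starting point for that question, a hedged sketch: fancy indexing with k random indices must gather and copy k elements, so it is O(k) in the number of indices, though the scattered memory jumps are typically less cache-friendly than a contiguous slice.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility
arr = np.arange(1_000_000)

# k = 100,000 random positions; arr[idx] gathers one element per index.
idx = rng.integers(0, arr.size, size=100_000)
fancy = arr[idx]  # O(k) work, and the result is a copy, not a view

assert fancy.shape == (100_000,)
assert not np.shares_memory(fancy, arr)
```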