Boolean indexing in NumPy - Time & Space Complexity
We want to understand how the time it takes to select data using boolean indexing changes as the data size grows.
How does the work grow when we filter arrays with conditions?
Analyze the time complexity of the following code snippet.
```python
import numpy as np

arr = np.arange(1000)     # integers 0..999
mask = arr % 2 == 0       # boolean mask: True where the element is even
filtered = arr[mask]      # new array containing only the even elements
```
This code creates an array, makes a mask for even numbers, and selects those numbers using boolean indexing.
Identify the loops, recursion, or array traversals — the operations that repeat.
- Primary operation: checking each element against the condition (arr % 2 == 0), then gathering the elements where the mask is True.
- How many times: each step touches every element once — one pass to build the mask and one pass to select — so roughly 2n operations for an array of size n, which is still proportional to n.
As the array size grows, the number of checks and selections grows roughly the same way.
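This per-element pattern is easy to confirm: the mask always has exactly one boolean per element, so the number of checks equals the array size. A minimal sketch (the loop over a few sizes is just for illustration):

```python
import numpy as np

# For each size n, the mask holds n booleans (one check per element),
# and selection makes one more pass over those n entries.
for n in (10, 100, 1000):
    arr = np.arange(n)
    mask = arr % 2 == 0      # n comparisons, one per element
    filtered = arr[mask]     # one pass to gather the True positions
    print(n, mask.size, filtered.size)
# → 10 10 5
# → 100 100 50
# → 1000 1000 500
```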
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks and selections |
| 100 | About 100 checks and selections |
| 1000 | About 1000 checks and selections |
Pattern observation: The work grows directly with the size of the input array.
Time Complexity: O(n)
Space Complexity: O(n) — the boolean mask stores one entry per element, and boolean indexing returns a new copy of the selected elements, so extra memory also grows with the input.
This means the time to filter grows linearly as the array gets bigger.
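The memory side can be inspected directly: the mask is a full-size boolean array, and the result of boolean indexing is a copy, not a view. A quick check (exact byte counts for the filtered array depend on the platform's default integer width):

```python
import numpy as np

arr = np.arange(1000)
mask = arr % 2 == 0
filtered = arr[mask]

# The mask costs one byte per element (dtype=bool), regardless of
# how many elements end up selected.
print(mask.dtype, mask.nbytes)   # → bool 1000

# Boolean indexing allocates a new array; modifying it does not
# touch the original, which confirms it is a copy rather than a view.
filtered[0] = -1
print(arr[0])                    # → 0 (original is unchanged)
```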
[X] Wrong: "Boolean indexing is instant no matter the array size because it just picks elements."
[OK] Correct: Boolean indexing must check every element to decide whether it satisfies the condition, so it takes longer on bigger arrays.
Understanding how filtering data scales helps you write efficient code and explain your choices clearly in real projects and interviews.
"What if we used multiple conditions combined with & or | in the mask? How would the time complexity change?"