np.intersect1d() for intersection in NumPy - Time & Space Complexity
We want to understand how the time needed to find the common elements of two arrays grows as those arrays get bigger. In other words: how does the work increase as the input size grows?
Analyze the time complexity of the following code snippet.
```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([3, 4, 5, 6, 7])

common = np.intersect1d(arr1, arr2)
print(common)  # [3 4 5]
```
This code finds the common elements between two arrays using np.intersect1d.
Identify the operations that repeat: loops, recursion, and array traversals.
- Primary operation: Sorting both input arrays and then scanning them to find common elements.
- How many times: Each array is sorted once (roughly n log n comparisons), followed by a single linear pass through the sorted values to find the matches.
As the input arrays grow, the sorting step grows as n log n, while the scanning step grows linearly with n.
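This sort-then-scan strategy can be sketched by hand (a simplified model of what np.intersect1d does internally, not its actual implementation):

```python
import numpy as np

def intersect_sketch(arr1, arr2):
    """Sort-then-scan intersection: O(n log n) to sort, O(n) to scan."""
    a = np.unique(arr1)  # deduplicate and sort: O(n log n)
    b = np.unique(arr2)
    i = j = 0
    out = []
    # Single linear pass over both sorted arrays: O(n)
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return np.array(out)

print(intersect_sketch([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))  # [3 4 5]
```

The sort is the expensive part; once both arrays are sorted, one merge-style pass is enough to find every common element.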
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 * log2(10) ≈ 33 operations for sorting, plus 10 for scanning |
| 100 | About 100 * log2(100) ≈ 664 operations for sorting, plus 100 for scanning |
| 1000 | About 1000 * log2(1000) ≈ 9,966 operations for sorting, plus 1000 for scanning |
Pattern observation: The sorting dominates and grows a bit faster than the input size, while scanning grows directly with input size.
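You can check this pattern numerically, modeling the sort cost as n * log2(n) comparisons (a rough model; real sort implementations differ by a constant factor):

```python
import math

for n in (10, 100, 1000):
    sort_ops = n * math.log2(n)  # approximate sorting cost
    scan_ops = n                 # linear scanning cost
    print(f"n={n:5d}  sort ≈ {sort_ops:8.0f}  scan = {scan_ops}")
```

Notice that when n grows by a factor of 10, the sort cost grows by a factor of about 13 to 20, slightly more than the linear scan's factor of 10.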
Time Complexity: O(n log n)
This means the time to find the intersection grows slightly faster than linearly with the size of the input arrays, because of the sorting step.
[X] Wrong: "Finding common elements is just a simple loop, so it must be O(n)."
[OK] Correct: np.intersect1d sorts its inputs first, and sorting alone costs more than a single linear pass, so the overall time is O(n log n), not O(n).
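For contrast, here is what the "simple loop" intuition actually implies if you avoid sorting: a nested scan that compares every pair of elements, which is O(n * m), worse than the sorting approach (an illustrative sketch, not how NumPy works):

```python
def intersect_naive(arr1, arr2):
    """Nested-loop intersection: O(n * m) comparisons, no sorting."""
    seen = []
    for x in arr1:           # outer loop: n iterations
        for y in arr2:       # inner loop: m iterations each time
            if x == y and x not in seen:
                seen.append(x)
    return seen

print(intersect_naive([1, 2, 3, 4, 5], [3, 4, 5, 6, 7]))  # [3, 4, 5]
```

So the loop-based alternative is quadratic, which is why paying O(n log n) up front to sort is the better deal for large arrays.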
Understanding how sorting affects performance helps you explain and improve data operations in real projects.
"What if the input arrays were already sorted? How would the time complexity change?"