0
0
NumPydata~5 mins

np.union1d() for union in NumPy - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: np.union1d() for union
O(n log n)
Understanding Time Complexity

We want to understand how the time needed to find the union of two arrays changes as the arrays get bigger.

Specifically, how does np.union1d() behave when input size grows?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([3, 4, 5, 6])

result = np.union1d(arr1, arr2)
print(result)

This code finds all unique elements from both arrays combined, sorted in order.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Sorting and merging the two arrays after removing duplicates.
  • How many times: The arrays are traversed multiple times internally, mainly during sorting and merging steps.
How Execution Grows With Input

As the size of the input arrays grows, the time to find the union grows a bit faster than the size itself.

Input Size (n)Approx. Operations
10About 10 log 10 (around 33)
100About 100 log 100 (around 664)
1000About 1000 log 1000 (around 9966)

Pattern observation: The operations grow a bit faster than the input size, roughly multiplying by the input size times the log of the input size.

Final Time Complexity

Time Complexity: O(n log n)

This means the time needed grows a bit faster than the input size, mainly because sorting is involved.

Common Mistake

[X] Wrong: "np.union1d() just scans the arrays once, so it runs in linear time O(n)."

[OK] Correct: Internally, np.union1d() sorts the combined data, which takes more time than just scanning once. So it is slower than linear time.

Interview Connect

Understanding how functions like np.union1d() scale helps you reason about performance in data tasks. This skill shows you can think about efficiency, which is valuable in real projects.

Self-Check

"What if the input arrays were already sorted? How would that affect the time complexity of np.union1d()?"