Disjoint set (Union-Find) in Data Structures Theory - Time & Space Complexity
We want to understand how the time needed to perform union and find operations changes as the number of elements grows in a Disjoint Set (Union-Find) data structure.
How does the cost of these operations scale when we have more elements to manage?
Analyze the time complexity of the following Union-Find operations with path compression and union by rank.
class DisjointSet:
def __init__(self, n):
self.parent = list(range(n))
self.rank = [0] * n
def find(self, x):
if self.parent[x] != x:
self.parent[x] = self.find(self.parent[x]) # path compression
return self.parent[x]
def union(self, x, y):
rootX = self.find(x)
rootY = self.find(y)
if rootX != rootY:
if self.rank[rootX] < self.rank[rootY]:
self.parent[rootX] = rootY
elif self.rank[rootX] > self.rank[rootY]:
self.parent[rootY] = rootX
else:
self.parent[rootY] = rootX
self.rank[rootX] += 1
This code manages groups of elements, quickly finding which group an element belongs to and joining groups together.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: The
findfunction uses recursion to climb up the parent links until it finds the root. - How many times: Each
findcall may traverse up to the height of the tree, but path compression flattens the tree over time. - Union calls use
findtwice and then link roots, which is a constant time operation after finds.
Initially, trees may be tall, so find takes longer. But path compression quickly flattens trees, making future finds faster.
| Input Size (n) | Approx. Operations per find/union |
|---|---|
| 10 | Very few steps, close to 1 |
| 100 | Few steps, slightly more than 1 |
| 1000 | Still very close to 1 due to flattening |
Pattern observation: The cost grows very slowly and almost stays constant for large inputs because of path compression and union by rank.
Time Complexity: O(α(n))
This means each operation takes almost constant time, where α(n) is the inverse Ackermann function, which grows so slowly it can be treated as a tiny constant for all practical input sizes.
[X] Wrong: "Each find operation always takes time proportional to the number of elements because it might traverse the whole chain."
[OK] Correct: Path compression flattens the structure during finds, so after some operations, the trees become very shallow, making finds almost constant time.
Understanding the efficient time complexity of Union-Find shows you can analyze how clever improvements affect performance, a valuable skill in many coding problems.
"What if we removed path compression from the find operation? How would the time complexity change?"