Collision handling (chaining) in Data Structures Theory - Time & Space Complexity
When storing data in a hash table, sometimes two items want the same spot. This is called a collision.
We want to understand how handling collisions by chaining affects the time it takes to find or add items.
Analyze the time complexity of inserting and searching using chaining in a hash table.
class HashTable:
def __init__(self, size):
self.size = size
self.table = [[] for _ in range(size)]
def insert(self, key, value):
index = hash(key) % self.size
self.table[index].append((key, value))
def search(self, key):
index = hash(key) % self.size
for k, v in self.table[index]:
if k == key:
return v
return None
This code stores items in buckets (lists) at positions based on the key's hash. Collisions are handled by adding items to the list at that bucket.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Looping through the list (chain) at a bucket during search.
- How many times: Depends on how many items collided and are stored in the same bucket.
As more items are added, the chains in each bucket can grow longer if the table size stays the same.
| Input Size (n) | Approx. Operations (chain length) |
|---|---|
| 10 | About 1 or 2 items per bucket |
| 100 | About 10 items per bucket if size is 10 |
| 1000 | About 100 items per bucket if size is 10 |
Pattern observation: If the number of buckets stays fixed, the time to search or insert grows roughly in proportion to the number of items per bucket.
Time Complexity: O(n/k) where n is number of items and k is number of buckets
This means the time depends on how many items share the same bucket. If buckets grow with items, it stays close to O(1).
[X] Wrong: "Searching in a chained hash table is always O(1) no matter what."
[OK] Correct: If many items collide in the same bucket, the chain grows and search time increases. So it depends on how well the hash spreads items and the table size.
Understanding collision handling with chaining helps you explain how hash tables work in real systems and why choosing a good hash function and table size matters.
"What if we doubled the number of buckets as we add more items? How would the time complexity change?"