
What Is a Hash Index in DBMS: Explanation and Use Cases

A hash index is a type of database index that uses a hash function to map keys to specific locations for fast data lookup. It allows quick access to rows by converting search keys into a hash value, which points directly to the data location.
⚙️

How It Works

A hash index works like a smart address book. Imagine you want to find a friend's phone number quickly. Instead of searching every entry, you apply a formula (a hash function) to their name to get a number. That number tells you which page of the address book to open.

In databases, the hash function takes the search key (like a user ID) and converts it into a hash value. This value identifies a specific bucket or slot where the matching entry is stored. Because the location is computed directly from the key, the database can find the data in roughly constant time on average, without scanning the whole table.
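
The key-to-bucket step described above can be sketched in a few lines. This is only an illustration, not how any particular database does it: the bucket count, the `bucket_for` helper, and the choice of CRC32 as the hash function are all assumptions made for the example. CRC32 is used here because it returns the same value on every run, whereas Python's built-in `hash()` is randomized per process for strings.

```python
import zlib

NUM_BUCKETS = 8  # assumed bucket count, chosen only for illustration


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a key to a bucket number using a deterministic hash (CRC32)."""
    hash_value = zlib.crc32(key.encode("utf-8"))
    return hash_value % num_buckets


# Hypothetical user IDs; each one always lands in the same bucket.
for user_id in ["user:1001", "user:1002", "user:1003"]:
    print(user_id, "-> bucket", bucket_for(user_id))
```

Two different keys may print the same bucket number here, which is the collision case.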

However, if two keys produce the same hash value (called a collision), the database uses methods like chaining or open addressing to handle them, ensuring data is still found correctly.
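
As a rough sketch of open addressing, the toy class below uses linear probing: when a key's home slot is taken, it tries the next slot, wrapping around at the end. The class name `LinearProbingIndex` and its methods are hypothetical, and the sketch leaves out deletion and resizing, which a real implementation would need.

```python
class LinearProbingIndex:
    """Toy open-addressing hash index (linear probing); no deletion or resizing."""

    def __init__(self, size=8):
        self.size = size
        self.slots = [None] * size  # each slot holds one (key, value) pair or None

    def _probe(self, key):
        """Yield slot numbers starting at the key's home slot, wrapping around."""
        start = hash(key) % self.size
        for step in range(self.size):
            yield (start + step) % self.size

    def insert(self, key, value):
        for i in self._probe(key):
            # Take the first empty slot, or overwrite an existing entry for this key.
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("index is full")

    def search(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                return None  # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return None


idx = LinearProbingIndex(size=4)
idx.insert(1, "a")  # 1 % 4 == 1, so key 1 lands in slot 1
idx.insert(5, "b")  # 5 % 4 == 1 as well (a collision), so key 5 moves to slot 2
print(idx.search(1), idx.search(5), idx.search(9))  # a b None
```

With chaining, by contrast, both colliding entries would stay in the same bucket as items in a list.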

💻

Example

This example shows a simple hash index simulation in Python. It uses Python's built-in hash() to pick a bucket and handles collisions by chaining: each bucket is a list of key-value pairs.

```python
class HashIndex:
    def __init__(self, size=10):
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _hash(self, key):
        return hash(key) % self.size

    def insert(self, key, value):
        index = self._hash(key)
        # Check if key exists and update
        for i, (k, v) in enumerate(self.buckets[index]):
            if k == key:
                self.buckets[index][i] = (key, value)
                return
        # Otherwise, add new key-value
        self.buckets[index].append((key, value))

    def search(self, key):
        index = self._hash(key)
        for k, v in self.buckets[index]:
            if k == key:
                return v
        return None

# Create hash index
index = HashIndex()
index.insert('apple', 100)
index.insert('banana', 200)
index.insert('orange', 300)

# Search values
print(index.search('banana'))
print(index.search('grape'))
```

Output

```
200
None
```
🎯

When to Use

Use a hash index when you need very fast lookups for exact matches, such as finding a record by a unique ID or key. It is ideal for queries that use equality conditions (e.g., WHERE id = 123).

Hash indexes are less suitable for range queries (like finding values between 10 and 20) because hash functions do not preserve order. They are commonly used in key-value stores, caching systems, and databases where quick direct access is critical.
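
The difference between exact-match and range lookups can be illustrated with a small sketch. Here a Python `dict` stands in for a hash index and a sorted list with `bisect` stands in for an ordered index; both stand-ins and the sample data are assumptions made for illustration only.

```python
import bisect

rows = {17: "q", 3: "a", 12: "m", 25: "z", 10: "k"}  # made-up sample data

# Hash-style structure: an exact match is a single lookup...
hash_index = dict(rows)
print(hash_index.get(12))  # m

# ...but a range query has to examine every key, because keys are not ordered.
in_range_scan = sorted(k for k in hash_index if 10 <= k <= 20)

# Ordered structure: binary search finds both ends of the range directly.
sorted_keys = sorted(rows)
lo = bisect.bisect_left(sorted_keys, 10)
hi = bisect.bisect_right(sorted_keys, 20)
in_range_ordered = sorted_keys[lo:hi]

print(in_range_scan, in_range_ordered)  # [10, 12, 17] [10, 12, 17]
```

Both approaches return the same keys, but the hash-style version visits every entry to do so, while the ordered version only touches the matching slice.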

For example, a user authentication system might use a hash index on usernames to quickly find user details during login.

Key Points

  • A hash index uses a hash function to map keys to data locations.
  • It provides very fast exact-match lookups.
  • It is not suitable for range or ordered queries.
  • It handles collisions using chaining or open addressing.
  • Commonly used in key-value databases and caching.

Key Takeaways

A hash index speeds up data retrieval by converting keys into direct data locations using a hash function.
It is best for exact-match queries but not for range searches.
Collisions are resolved with techniques such as chaining or open addressing, so lookups still return the correct data.
Hash indexes are widely used in systems needing fast key-based lookups like caches and key-value stores.