Hash Indexes in DBMS Theory - Time & Space Complexity
When a database uses a hash index, the key question is how quickly records can be found or inserted. Specifically, we want to know how that lookup or insertion time changes as the number of records grows.
Analyze the time complexity of searching for a record using a hash index.
```sql
-- Assume a hash index on column 'id' in table 'users'
SELECT * FROM users WHERE id = 12345;
-- The database uses the hash index to find the record quickly
```
This query uses a hash index to find a user by their unique ID.
Look for repeated steps in the search process.
- Primary operation: Computing the hash function for the key and accessing the bucket.
- How many times: Usually once, unless there is a collision requiring a few checks.
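The two steps above can be sketched in Python. This is a minimal, hypothetical bucket layout to illustrate the idea, not how any particular database engine actually stores its index:

```python
# Minimal sketch of a hash index: hash the key to pick a bucket,
# then check only the entries in that one bucket.
NUM_BUCKETS = 8

def bucket_for(key):
    # Step 1: compute the hash and map it to a bucket (one operation).
    return hash(key) % NUM_BUCKETS

class HashIndex:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, key, row):
        self.buckets[bucket_for(key)].append((key, row))

    def lookup(self, key):
        # Step 2: scan only the chosen bucket; with few collisions this
        # is a handful of checks regardless of total table size.
        for k, row in self.buckets[bucket_for(key)]:
            if k == key:
                return row
        return None

index = HashIndex()
index.insert(12345, {"name": "Alice"})
print(index.lookup(12345))  # found without scanning all records
```

Note that `lookup` never iterates over all records, only over the one bucket the key hashes to.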
Finding a record with a hash index usually takes about the same time no matter how many records there are.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 1-2 operations |
| 100 | 1-2 operations |
| 1000 | 1-2 operations |
Pattern observation: The number of steps stays almost the same as data grows, thanks to direct access via hashing.
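The flat operation counts in the table above can be reproduced with a rough sketch that counts key comparisons per lookup as the data grows. The bucket-per-hash layout and the choice to size the bucket array to the data are assumptions for illustration, not a real engine's behavior:

```python
# Count key comparisons per lookup for growing n.
def build_index(n, num_buckets):
    buckets = [[] for _ in range(num_buckets)]
    for key in range(n):
        buckets[hash(key) % num_buckets].append(key)
    return buckets

def probes_to_find(buckets, key):
    bucket = buckets[hash(key) % len(buckets)]
    # One comparison per entry inspected in the target bucket.
    return bucket.index(key) + 1

for n in (10, 100, 1000):
    # Real engines grow the bucket array with the data; here we mimic
    # that by keeping roughly one entry per bucket.
    buckets = build_index(n, num_buckets=2 * n)
    worst = max(probes_to_find(buckets, key) for key in range(n))
    print(f"n={n}: worst-case probes = {worst}")
```

With enough well-distributed buckets, the worst-case probe count stays small and essentially flat as `n` grows tenfold and then a hundredfold.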
Time Complexity: O(1) on average

This means the expected time to find a record using a hash index stays constant regardless of table size, assuming the hash function spreads keys evenly across buckets.
[X] Wrong: "Hash indexes always take longer as the table grows because they check every record."
[OK] Correct: Hash indexes use a hash function to jump directly to the right place, so they don't scan all records.
Understanding hash indexes helps you explain how databases quickly find data, a useful skill in many technical discussions.
"What if the hash function causes many collisions? How would the time complexity change then?"
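The question above can be explored with a short sketch. If a (deliberately bad, purely illustrative) hash function sends every key to the same bucket, a lookup degrades to scanning one long chain, so the time complexity becomes O(n):

```python
# Worst case: every key collides into the same bucket.
def bad_hash(key):
    return 0  # deliberately terrible: all keys map to bucket 0

NUM_BUCKETS = 8
buckets = [[] for _ in range(NUM_BUCKETS)]
for key in range(1000):
    buckets[bad_hash(key) % NUM_BUCKETS].append(key)

def probes_to_find(key):
    # Comparisons needed to locate the key within its bucket.
    return buckets[bad_hash(key) % NUM_BUCKETS].index(key) + 1

print(probes_to_find(999))  # prints 1000: the last key needs n comparisons
```

With heavy collisions the index behaves like an unsorted list, which is why the worst-case complexity of a hash lookup is O(n) even though the average case is O(1).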