File organization (heap, sequential, hashing) in DBMS Theory - Time & Space Complexity
When working with databases, how fast we can find or store data depends on how files are organized.
We want to understand how the time to access or add data changes as the data grows.
Analyze the time complexity of searching a record in different file organizations.
-- Heap file search
SELECT * FROM heap_file WHERE key = 'value';
-- Sequential file search
SELECT * FROM sequential_file WHERE key = 'value';
-- Hash file search
SELECT * FROM hash_file WHERE key = 'value';
This shows searching for a record by key in three file types: heap, sequential, and hash organized files.
Look at how many records we check to find the target.
- Heap file: Scans records one by one until found or end reached.
- Sequential file: Scans records in order until key found or passed.
- Hash file: Uses a hash function to jump directly to the record's location.
- Dominant operation: Number of record checks or jumps to find the target.
As the number of records (n) grows, the search time changes differently for each file type.
| Input Size (n) | Heap Search Checks | Sequential Search Checks | Hash Search Checks |
|---|---|---|---|
| 10 | Up to 10 | Up to 10 | About 1 |
| 100 | Up to 100 | Up to 100 | About 1 |
| 1000 | Up to 1000 | Up to 1000 | About 1 |
Heap and sequential searches grow linearly with data size, while hash search stays almost constant.
Time Complexity: O(n) for heap and sequential, O(1) average case for hashing
This means heap and sequential searches take longer as data grows, but hashing finds records quickly regardless of size in average cases.
[X] Wrong: "Hashing always guarantees instant access without any delay."
[OK] Correct: Hashing can have collisions, causing some extra checks, so access is usually very fast but not always instant.
Understanding these file organizations helps you explain how databases handle data efficiently, a key skill in many database and software roles.
"What if the hash function causes many collisions? How would that affect the time complexity of searching in a hash file?"