CRUD operations in HBase in Hadoop - Time & Space Complexity
When working with HBase, it is important to understand how the time to perform create, read, update, and delete (CRUD) operations changes as the table grows: does each operation take longer as more rows are added or more operations are performed?
Analyze the time complexity of the following HBase CRUD operations.
```java
// Imports from the HBase client library
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// 'table' is an already-open Table instance obtained from a Connection.

// Put operation to add or update a row
Put put = new Put(Bytes.toBytes("row1"));
put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("qual"), Bytes.toBytes("value"));
table.put(put);

// Get operation to read a row
Get get = new Get(Bytes.toBytes("row1"));
Result result = table.get(get);

// Delete operation to remove a row
Delete delete = new Delete(Bytes.toBytes("row1"));
table.delete(delete);
```
This code adds or updates a row, reads a row, and deletes a row in HBase. To analyze the complexity, consider what each operation actually has to do.
- Primary operation: accessing a single row by its row key.
- Repetitions: each CRUD call touches exactly one row, so the keyed access happens once per operation.
As the number of rows in the table grows, each operation still targets one row by key.
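As a rough analogy (using plain Java collections, not the HBase client API), keyed access into a hash map behaves the same way: each put, get, and remove touches one entry by its key, no matter how many entries the map holds. The class name `KeyedCrudSketch` is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyedCrudSketch {
    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>();

        // Create/Update: one keyed write, independent of map size
        table.put("row1", "value");

        // Read: one keyed lookup
        String value = table.get("row1");

        // Delete: one keyed removal
        table.remove("row1");

        System.out.println(value);                  // value
        System.out.println(table.containsKey("row1")); // false
    }
}
```

Whether the map holds 10 entries or 10 million, each of these three calls performs one keyed access, which is the same shape of cost HBase aims for with row-key operations.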
| Rows in table (n) | Approx. steps per operation |
|---|---|
| 10 | ~1 |
| 100 | ~1 |
| 1000 | ~1 |
Pattern observation: The time to perform each operation stays roughly the same no matter how many rows exist.
Time Complexity: O(1)
This means each create, read, update, or delete action takes about the same time regardless of table size.
[X] Wrong: "CRUD operations get slower as the table grows because there are more rows to check."
[OK] Correct: HBase locates the region that holds the row key and uses indexes within that region to jump directly to the row, so it never scans all rows for a keyed operation.
Understanding that HBase CRUD operations run in constant time helps you explain how big data systems handle large datasets efficiently.
"What if we tried to scan the entire table instead of accessing by row key? How would the time complexity change?"
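To make that question concrete, here is a small plain-Java sketch (an analogy, not the HBase client; the class and method names are invented for illustration) comparing a keyed lookup with a full scan. The scan must visit every entry until it finds the target, so its work grows linearly with n, while the keyed get does not.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ScanVsGet {
    // Full scan: visit rows in order until the key matches -> O(n)
    static int scanSteps(Map<String, String> rows, String key) {
        int steps = 0;
        for (Map.Entry<String, String> e : rows.entrySet()) {
            steps++;
            if (e.getKey().equals(key)) break;
        }
        return steps;
    }

    public static void main(String[] args) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (int i = 1; i <= 1000; i++) rows.put("row" + i, "v" + i);

        // Keyed get: one hash lookup, independent of table size -> O(1)
        System.out.println(rows.get("row1000"));        // v1000

        // Scan: had to step through all 1000 rows to reach the last one
        System.out.println(scanSteps(rows, "row1000")); // 1000
    }
}
```

This is why a full table scan in HBase is O(n): the cost is proportional to the number of rows visited, not to a single keyed lookup.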