Recall & Review
beginner
What is HDFS?
HDFS stands for Hadoop Distributed File System. It stores large files across many machines in a cluster. It is designed for batch processing and high throughput.
Click to reveal answer
beginner
What is HBase?
HBase is a NoSQL database that runs on top of HDFS. It stores data in tables with rows and columns and supports fast read/write access to large datasets.
Click to reveal answer
intermediate
How does HBase differ from HDFS in data access?
HDFS is for storing files and supports batch reading. HBase allows random, real-time read/write access to data in tables.
Click to reveal answer
intermediate
Which system is better for real-time queries: HBase or HDFS?
HBase is better for real-time queries because it supports fast random reads and writes. HDFS is better for large, sequential data processing.
Click to reveal answer
beginner
Can HBase work without HDFS?
No, HBase depends on HDFS to store its data files. HDFS provides the underlying storage layer for HBase tables.
Click to reveal answer
What type of system is HDFS?
✗ Incorrect
HDFS is a distributed file storage system designed to store large files across many machines.
Which system supports fast random read/write access?
✗ Incorrect
HBase supports fast random read/write access, unlike HDFS which is optimized for batch processing.
HBase stores data in what format?
✗ Incorrect
HBase stores data in tables with rows and columns, similar to a database.
Which system is designed for batch processing?
✗ Incorrect
HDFS is designed for batch processing of large files.
Does HBase require HDFS to function?
✗ Incorrect
HBase depends on HDFS as its underlying storage system.
Explain the main differences between HBase and HDFS in simple terms.
Think about how you store and access data differently in a file system versus a database.
You got /5 concepts.
Describe a real-life scenario where you would choose HBase over HDFS.
Imagine you want to quickly look up or update a single record in a huge dataset.
You got /5 concepts.