
HBase data model (column families) in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding Column Families in HBase

In HBase, what is the primary purpose of grouping columns into column families?

A. To physically store related columns together on disk for efficient access
B. To enforce data type constraints on columns within the family
C. To automatically replicate data across different clusters
D. To define user access permissions for individual columns
💡 Hint

Think about how HBase organizes data on disk to improve read/write performance.

intermediate
Column Family Storage Impact on Data Retrieval

Consider an HBase table with two column families: info and metrics. If you query only columns from the info family, what is the expected impact on data retrieval performance?

A. Only the info family data blocks are read, improving query speed
B. Both info and metrics families are read, no performance gain
C. The entire row is scanned regardless of column family, slowing down the query
D. Data from metrics family is cached automatically, speeding up retrieval
💡 Hint

Recall how HBase stores column families separately on disk.
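To make the hint concrete, here is a minimal sketch of per-family storage. It is a toy in-memory model in Python, not the HBase client API: `ToyTable`, its `reads` counter, and the sample row are all hypothetical names chosen for illustration. The model assumes each column family is kept in its own store, so a read restricted to one family never touches the other.

```python
from collections import defaultdict


class ToyTable:
    """Toy stand-in for an HBase table: one separate store per column family."""

    def __init__(self, families):
        # family name -> {row key -> {qualifier -> value}}
        self.stores = {family: defaultdict(dict) for family in families}
        # how many times each family's store has been read (illustrative only)
        self.reads = {family: 0 for family in families}

    def put(self, row, family, qualifier, value):
        self.stores[family][row][qualifier] = value

    def get(self, row, families=None):
        """Read a row, touching only the stores of the requested families."""
        result = {}
        for family in (families or self.stores):
            self.reads[family] += 1
            for qualifier, value in self.stores[family].get(row, {}).items():
                result[f"{family}:{qualifier}"] = value
        return result


table = ToyTable(["info", "metrics"])
table.put("user1", "info", "name", "Ada")
table.put("user1", "metrics", "logins", 42)

# Ask only for the info family; the metrics store is never consulted.
info_only = table.get("user1", families=["info"])
print(info_only)
print(table.reads)
```

The `reads` counter stands in for disk I/O: in this model, the cost of a query grows with the number of families it names, not with the number of families the table has.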

Data Modeling
advanced
Designing Column Families for Efficient Writes

You are designing an HBase table for a sensor data application. Sensors send frequent updates for temperature and humidity. Which column family design will optimize write performance?

A. Create one column family for all sensor data (temperature and humidity together)
B. Create separate column families for temperature and humidity data
C. Create a column family for metadata and store sensor data as JSON in one column
D. Use one column family but store temperature and humidity in different rows
💡 Hint

Think about how HBase handles writes to different column families.
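One way to explore this hint is with a small simulation. The sketch below is a toy Python model, not HBase code: `ToyRegion`, `flush_threshold`, and the family names are hypothetical. It assumes each column family has its own in-memory write buffer (memstore), and, as a deliberate simplification, that a flush is triggered by the total buffered cell count in the region and writes one file per non-empty family. Real flush behavior depends on the HBase version and configured flush policy.

```python
class ToyRegion:
    """Toy model of a region: one write buffer (memstore) per column family."""

    def __init__(self, families, flush_threshold):
        self.memstores = {family: [] for family in families}
        self.flush_threshold = flush_threshold  # total buffered cells (assumed)
        self.files_written = 0

    def put(self, row, cells):
        """cells maps (family, qualifier) -> value."""
        for (family, qualifier), value in cells.items():
            self.memstores[family].append((row, qualifier, value))
        buffered = sum(len(m) for m in self.memstores.values())
        if buffered >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Simplification: every non-empty family is flushed to its own file.
        for memstore in self.memstores.values():
            if memstore:
                self.files_written += 1
                memstore.clear()


single = ToyRegion(["d"], flush_threshold=4)
split = ToyRegion(["temp", "hum"], flush_threshold=4)

for i in range(8):
    row = f"sensor1-{i}"
    single.put(row, {("d", "temp"): 20 + i, ("d", "hum"): 50 + i})
    split.put(row, {("temp", "v"): 20 + i, ("hum", "v"): 50 + i})

print(single.files_written, split.files_written)
```

Under these assumptions the same eight readings produce twice as many, smaller files when the two values are split across families, which hints at the extra flush and compaction work that additional families can cause for data that is always written together.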

🔧 Debug
advanced
Identifying Column Family Misconfiguration

An HBase table has two column families: cf1 and cf2. After adding many columns to cf1, you notice slow read performance. Which misconfiguration is most likely causing this?

A. Using different column families for unrelated data
B. Column families are not compressed, causing slow reads
C. Too many columns in a single column family causing large data blocks
D. Row keys are not unique, causing read conflicts
💡 Hint

Consider how column family size affects disk I/O.

🎯 Scenario
expert
Optimizing Column Family Design for Mixed Access Patterns

Your HBase table stores user profiles and their activity logs. Profiles are read frequently but updated rarely. Activity logs are written frequently but read less often. How should you design column families to optimize both read and write performance?

A. Store profiles in one table and activity logs in a separate HBase table
B. Use one column family for both profiles and activity logs to simplify schema
C. Store profiles as JSON in one column family and activity logs as separate columns in another
D. Create separate column families: one for profiles and one for activity logs
💡 Hint

Think about how HBase handles read and write workloads per column family.