Block storage and replication in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding HDFS Block Replication Factor
In Hadoop Distributed File System (HDFS), what happens if the replication factor of a file is set to 1?
A. Each block of the file is stored on exactly one DataNode, with no additional copies.
B. The file is split into multiple blocks, each replicated twice on different DataNodes.
C. The file is stored on the NameNode only, without any DataNode storage.
D. The file is replicated on all DataNodes in the cluster.
💡 Hint
Think about what replication factor means for data copies.
Data Output
intermediate
Calculating Number of Blocks for a File
Given a file size of 450 MB and HDFS block size of 128 MB, how many blocks will HDFS create to store this file?
A. 3 blocks
B. 5 blocks
C. 4 blocks
D. 6 blocks
💡 Hint
Divide file size by block size and round up.
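
The rounding-up step in the hint can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, sizes are assumed to be whole megabytes, and the 300 MB example is deliberately different from the file in the question.

```python
def num_blocks(file_size_mb: int, block_size_mb: int = 128) -> int:
    """Return how many blocks a file of this size would be split into."""
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    # Ceiling division using integers only: -(-a // b) rounds a / b up.
    return -(-file_size_mb // block_size_mb)


def last_block_size_mb(file_size_mb: int, block_size_mb: int = 128) -> int:
    """Return the size of the final block, which may be smaller than a full block."""
    remainder = file_size_mb % block_size_mb
    # A zero remainder means the last block is completely full
    # (or the file is empty, in which case the size is 0).
    return remainder if remainder else min(file_size_mb, block_size_mb)


# Hypothetical example: a 300 MB file with 128 MB blocks.
example_blocks = num_blocks(300)
example_last = last_block_size_mb(300)
print(example_blocks, example_last)
```

The final block only holds the data that is left over, so a file rarely occupies a whole number of full blocks.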
Predict Output
advanced
HDFS Block Replication Command Output
What does the output of the following HDFS command indicate about the file?
hdfs fsck /user/data/file.txt -files -blocks -racks
Sample output snippet:

Status: HEALTHY
 Total size: 2560 B
 Total blocks: 2 (avg. block size 1280 B)
 Minimally replicated blocks: 2
 Over-replicated blocks: 0
 Under-replicated blocks: 0

Block replica on datanode1:50010
Block replica on datanode2:50010
A. The file has 2 blocks, but some blocks are under-replicated.
B. The file is missing blocks and is unhealthy.
C. The file has 1 block with over-replication.
D. The file has 2 blocks, each fully replicated, with no under- or over-replication.
💡 Hint
Look at the replication status lines carefully.
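
When a report like this has to be checked by a script rather than by eye, the status lines can be pulled into a dictionary. The sketch below is a rough illustration, not an official parser: the function name and the report text are hypothetical, and it assumes the summary lines keep the "label: value" shape shown in the snippet above.

```python
import re

# Labels taken from the summary lines of the sample snippet.
_SUMMARY_LINE = re.compile(
    r"^(Status|Total blocks|Minimally replicated blocks|"
    r"Over-replicated blocks|Under-replicated blocks):\s*(\S+)"
)


def parse_fsck_summary(text: str) -> dict:
    """Collect the summary counters from fsck-style report text."""
    summary = {}
    for line in text.splitlines():
        match = _SUMMARY_LINE.match(line.strip())
        if match:
            key, value = match.groups()
            # Counters become integers; the status word stays a string.
            summary[key] = int(value) if value.isdigit() else value
    return summary


# Hypothetical report text, made up for this example.
SAMPLE_REPORT = """Status: HEALTHY
 Total blocks: 4 (avg. block size 1024 B)
 Minimally replicated blocks: 4
 Over-replicated blocks: 0
 Under-replicated blocks: 1
"""

summary = parse_fsck_summary(SAMPLE_REPORT)
print(summary)
```

A non-zero value under either the over-replicated or the under-replicated label is the signal to look for.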
Visualization
advanced
Visualizing Block Distribution Across DataNodes
You have a file split into 3 blocks with replication factor 3. Which visualization best represents the block distribution across 3 DataNodes?
A. Each block is stored on exactly 3 different DataNodes, with blocks distributed evenly.
B. Each block is stored on exactly one DataNode only.
C. Each DataNode stores all 3 blocks (full replication on each node).
D. Blocks are stored randomly with some blocks missing replicas.
💡 Hint
Replication factor 3 means 3 copies of each block on different nodes.
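
The rule in the hint, that every copy of a block lands on a different node, can be modeled with a toy placement function. This is a simplified sketch under stated assumptions: it uses plain round-robin assignment, whereas the default HDFS placement policy also takes racks into account, and the function names, node names, and cluster size are all hypothetical.

```python
def place_replicas(num_blocks: int, replication: int, datanodes: list) -> dict:
    """Assign each block to `replication` distinct DataNodes, round-robin."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes to hold every replica on a distinct node")
    placement = {}
    for block in range(num_blocks):
        # Consecutive offsets modulo the node count never repeat a node
        # as long as replication <= len(datanodes).
        placement[block] = [
            datanodes[(block + i) % len(datanodes)] for i in range(replication)
        ]
    return placement


def blocks_per_node(placement: dict) -> dict:
    """Count how many block replicas each DataNode ends up storing."""
    counts = {}
    for nodes in placement.values():
        for node in nodes:
            counts[node] = counts.get(node, 0) + 1
    return counts


# Hypothetical cluster: 2 blocks, replication factor 2, 4 DataNodes.
example_placement = place_replicas(2, 2, ["dn1", "dn2", "dn3", "dn4"])
print(example_placement)
print(blocks_per_node(example_placement))
```

Changing the three arguments to match the question shows what each DataNode would hold in that scenario.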
🔧 Debug
expert
Diagnosing Under-Replication in HDFS
You notice that the HDFS report shows some blocks as under-replicated. Which of the following is the most likely cause?
A. The replication factor is set to 1 for the file.
B. One or more DataNodes storing replicas are down or unreachable.
C. The file size is smaller than the block size.
D. The NameNode is running out of memory.
💡 Hint
Think about what causes missing replicas in a distributed system.
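
A block counts as under-replicated when it has fewer live replicas than its target replication factor. The check can be sketched as follows; the function name, the block IDs, and the replica counts are hypothetical values chosen for illustration, not output from a real cluster.

```python
def find_under_replicated(live_replicas: dict, target_replication: int) -> dict:
    """Map each under-replicated block to the number of replicas it is missing."""
    return {
        block: target_replication - count
        for block, count in live_replicas.items()
        if count < target_replication
    }


# Hypothetical live replica counts per block, with a target of 3.
example_counts = {"blk_1001": 3, "blk_1002": 2, "blk_1003": 1}
missing = find_under_replicated(example_counts, target_replication=3)
print(missing)
```

Blocks that already meet or exceed the target do not appear in the result.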