HDFS high availability in Hadoop - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is HDFS High Availability (HA)?
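HDFS High Availability (HA) runs redundant NameNodes in one cluster: exactly one is active and serves client requests, while one standby (or more than one, in Hadoop 3) keeps an up-to-date copy of the namespace. If the active NameNode fails, a standby can take over, so the NameNode is no longer a single point of failure.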
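As a rough illustration of what "several NameNodes behind one logical name" looks like in configuration terms, the sketch below builds the hdfs-site.xml properties that declare a nameservice and its NameNodes. The property names are the standard HDFS HA ones; the nameservice ID, NameNode IDs, hostnames, port, and helper function names are placeholders made up for this example.

```python
from xml.etree import ElementTree as ET

# Hypothetical values: the nameservice ID, NameNode IDs, and hostnames below
# are placeholders, not taken from any real cluster.
NAMESERVICE = "mycluster"
NAMENODES = {"nn1": "machine1.example.com", "nn2": "machine2.example.com"}


def ha_properties(nameservice, namenodes, rpc_port=8020):
    """Return the hdfs-site.xml properties that declare the HA NameNodes."""
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
    }
    # One RPC address per NameNode, keyed by nameservice and NameNode ID.
    for nn_id, host in namenodes.items():
        key = f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"
        props[key] = f"{host}:{rpc_port}"
    return props


def to_configuration_xml(props):
    """Render properties in the <configuration><property> layout Hadoop reads."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_configuration_xml(ha_properties(NAMESERVICE, NAMENODES)))
```

A working HA setup needs further settings that this sketch leaves out, such as the shared edits directory, the client failover proxy provider, and a fencing method.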
beginner
Why do we need a standby NameNode in HDFS HA?
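The standby NameNode takes over if the active NameNode fails or is taken down for maintenance. Because it continuously replays the active NameNode's edit log and also receives block reports from the DataNodes, it can take over quickly without losing namespace changes that were already committed.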
intermediate
What is the role of ZooKeeper in HDFS HA?
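ZooKeeper coordinates automatic failover. A ZKFailoverController (ZKFC) process runs alongside each NameNode, monitors its health, and keeps a session open in ZooKeeper; the NameNode whose ZKFC holds the ZooKeeper lock is the active one.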
intermediate
What is a Quorum Journal Manager (QJM) in HDFS HA?
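QJM is the shared edit-log storage used in HDFS HA. The active NameNode writes each edit to a group of JournalNodes (typically three), and a write succeeds once a majority of them acknowledge it. The standby NameNode reads the same edits to stay in sync.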
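The majority rule behind the Quorum Journal Manager can be sketched with a toy model. This is not the Hadoop implementation: `JournalNode` and `write_edit` are hypothetical names for this example, and real QJM also handles epochs and log recovery, which are omitted here.

```python
class JournalNode:
    """Toy stand-in for a JournalNode: keeps edits in memory."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.edits = []

    def append(self, txid, op):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        self.edits.append((txid, op))


def write_edit(journal_nodes, txid, op):
    """Send one edit to every JournalNode.

    Returns True only if a majority of the nodes acknowledged the write.
    """
    acks = 0
    for jn in journal_nodes:
        try:
            jn.append(txid, op)
            acks += 1
        except ConnectionError:
            pass  # an unreachable node simply does not count as an ack
    majority = len(journal_nodes) // 2 + 1
    return acks >= majority


if __name__ == "__main__":
    jns = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3", up=False)]
    print(write_edit(jns, 1, "mkdir /a"))  # 2 of 3 nodes acknowledge
```

With three JournalNodes the write still succeeds when one is down, but not when two are down, which is why an odd number of JournalNodes is the usual choice.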
intermediate
How does automatic failover work in HDFS HA?
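If the active NameNode fails, its ZKFC stops reporting it as healthy or its ZooKeeper session expires, and the lock it held is released. The standby's ZKFC then acquires the lock, fences the old active NameNode so it cannot keep writing, and transitions the standby to active, all without manual intervention.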
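The lock-based hand-over described above can be sketched as a small simulation. It is a simplified model, not ZooKeeper or ZKFC code: `LockService`, `NameNode`, and `failover_step` are hypothetical names, and fencing is reduced to a state change.

```python
class LockService:
    """Toy stand-in for the ZooKeeper lock: at most one holder at a time."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node_id):
        if self.holder is None:
            self.holder = node_id
        return self.holder == node_id

    def release(self, node_id):
        # Models the lock disappearing when the holder's session ends.
        if self.holder == node_id:
            self.holder = None


class NameNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.state = "standby"
        self.healthy = True


def failover_step(lock, namenodes):
    """Run one monitoring round and return the IDs of active NameNodes."""
    # Unhealthy nodes lose the lock and are fenced (simplified to a state).
    for nn in namenodes:
        if not nn.healthy:
            lock.release(nn.node_id)
            nn.state = "fenced"
    # Healthy nodes compete for the lock; only the holder becomes active.
    for nn in namenodes:
        if nn.healthy:
            nn.state = "active" if lock.try_acquire(nn.node_id) else "standby"
    return [nn.node_id for nn in namenodes if nn.state == "active"]


if __name__ == "__main__":
    lock = LockService()
    nn1, nn2 = NameNode("nn1"), NameNode("nn2")
    print(failover_step(lock, [nn1, nn2]))  # nn1 takes the lock first
    nn1.healthy = False
    print(failover_step(lock, [nn1, nn2]))  # nn2 takes over
```

Because only the lock holder is ever marked active, the model never has two active NameNodes at once, which is the property fencing protects in a real cluster.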
What does HDFS High Availability primarily prevent?
A. System downtime due to NameNode failure
B. Data replication across DataNodes
C. Slow data processing
D. Network congestion
Which component helps coordinate failover in HDFS HA?
A. ZooKeeper
B. DataNode
C. ResourceManager
D. NameNode
What is stored in the Quorum Journal Manager?
A. User data files
B. Configuration files
C. Backup snapshots
D. Edit logs from the active NameNode
How many NameNodes are active at the same time in HDFS HA?
A. Two
B. One
C. None
D. Depends on cluster size
What happens if the active NameNode fails in an HA setup?
A. Data is lost
B. The cluster stops working
C. The standby NameNode becomes active automatically
D. Manual restart is required
Explain how HDFS High Availability works to keep the system running without downtime.
Think about how two NameNodes share work and how the system switches between them.
Describe the role of ZooKeeper and Quorum Journal Manager in HDFS High Availability.
Focus on coordination and data sharing between NameNodes.