In an HDFS high availability setup, what is the primary role of the standby NameNode?
Think about what happens when the active NameNode stops working.
The standby NameNode keeps a synchronized copy of the namespace and can quickly take over if the active NameNode fails, which keeps downtime to a minimum.
What is the purpose of the Quorum Journal Manager (QJM) in HDFS high availability?
Consider how edit logs are shared between NameNodes.
The Quorum Journal Manager replicates edit logs across multiple JournalNodes: the active NameNode writes each edit to a majority of them, and the standby NameNode reads those edits back, so both NameNodes hold consistent state information.
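The majority rule behind this can be sketched in a few lines of Python. This is a toy model, not Hadoop code: the `JournalNode` class, the `write_edit` function, and the `reachable` flag are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class JournalNode:
    """Toy stand-in for a JournalNode (hypothetical, not a Hadoop class)."""
    name: str
    reachable: bool = True
    edits: list = field(default_factory=list)

    def append(self, txid: int, op: str) -> bool:
        # An unreachable node never acknowledges the write.
        if not self.reachable:
            return False
        self.edits.append((txid, op))
        return True


def write_edit(journal_nodes, txid, op):
    """Return True if a strict majority of JournalNodes acknowledged the edit."""
    acks = sum(jn.append(txid, op) for jn in journal_nodes)
    return acks > len(journal_nodes) // 2


jns = [JournalNode("jn1"), JournalNode("jn2"), JournalNode("jn3", reachable=False)]
committed = write_edit(jns, 1, "mkdir /data")   # 2 of 3 acks: majority reached
jns[1].reachable = False
rejected = write_edit(jns, 2, "mkdir /tmp")     # 1 of 3 acks: no majority
```

In this sketch the first edit counts as committed even with one JournalNode down, while the second does not, because only one of three nodes acknowledged it.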
Given the following simplified log snippet from an HDFS cluster during failover, what is the final state of the NameNodes?
2024-04-01 10:00:00 Active NameNode started
2024-04-01 10:05:00 Standby NameNode synchronized
2024-04-01 10:10:00 Active NameNode failed
2024-04-01 10:10:05 Standby NameNode transitioned to active
2024-04-01 10:15:00 New standby NameNode started
Look for the transition events after the active NameNode failure.
After the active NameNode failed, the standby took over as active, and a new standby NameNode was started to maintain high availability.
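The same reasoning can be replayed mechanically. The sketch below walks the log snippet from the question line by line and tracks each node's role; the labels `original active`, `original standby`, and `new standby` are assumptions made for readability, since the log itself does not name the hosts.

```python
LOG = """\
2024-04-01 10:00:00 Active NameNode started
2024-04-01 10:05:00 Standby NameNode synchronized
2024-04-01 10:10:00 Active NameNode failed
2024-04-01 10:10:05 Standby NameNode transitioned to active
2024-04-01 10:15:00 New standby NameNode started
"""


def final_roles(log: str) -> dict:
    """Replay the events and return the role each node holds at the end."""
    roles = {}
    for line in log.splitlines():
        # The first two fields are date and time; the rest is the event text.
        event = line.split(" ", 2)[2]
        if event == "Active NameNode started":
            roles["original active"] = "active"
        elif event == "Standby NameNode synchronized":
            roles["original standby"] = "standby"
        elif event == "Active NameNode failed":
            roles["original active"] = "failed"
        elif event == "Standby NameNode transitioned to active":
            roles["original standby"] = "active"
        elif event == "New standby NameNode started":
            roles["new standby"] = "standby"
    return roles


print(final_roles(LOG))
# {'original active': 'failed', 'original standby': 'active', 'new standby': 'standby'}
```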
In an HDFS high availability setup, a split-brain scenario occurred where both NameNodes became active simultaneously. Which misconfiguration below most likely caused this?
Think about what ensures only one NameNode is active at a time.
The shared edit log on the JournalNodes is what lets only one NameNode write at a time. If JournalNodes are not configured or are unreachable, the active and standby NameNodes cannot coordinate through that edit log, so both can end up active, leading to split-brain.
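One way to picture how a shared journal keeps a single writer is an epoch check: each journal node remembers the newest writer epoch it has accepted and refuses writes from older ones. The sketch below is a simplified toy model of that idea, not Hadoop code; the class, method, and exception names are hypothetical.

```python
class StaleWriterError(Exception):
    """Raised when a writer with an outdated epoch tries to write."""


class JournalNode:
    """Toy journal node that tracks the newest writer epoch it has accepted."""

    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch: int) -> None:
        # A NameNode becoming active must claim a strictly newer epoch.
        if epoch <= self.promised_epoch:
            raise StaleWriterError(
                f"epoch {epoch} is not newer than {self.promised_epoch}"
            )
        self.promised_epoch = epoch

    def journal(self, epoch: int, op: str) -> None:
        # Writes from any epoch other than the latest accepted one are rejected.
        if epoch != self.promised_epoch:
            raise StaleWriterError(f"writer epoch {epoch} is stale")
        self.edits.append(op)


jn = JournalNode()
jn.new_epoch(1)                 # first NameNode becomes active
jn.journal(1, "mkdir /a")
jn.new_epoch(2)                 # second NameNode takes over
jn.journal(2, "mkdir /b")

try:
    jn.journal(1, "mkdir /c")   # old active still believes it is the writer
    old_writer_rejected = False
except StaleWriterError:
    old_writer_rejected = True
```

Here the write from the old epoch is refused, so only the newer writer's edits reach the journal. Without a reachable journal there is nothing to perform this check.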
You are tasked with designing an HDFS high availability cluster for a critical application. Which combination of components and configurations below gives the most resilience and the least downtime during failover?
Consider components that coordinate failover automatically and maintain consistent state.
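Using two NameNodes (one active, one standby), three JournalNodes to maintain a quorum for edit log replication, and automatic failover coordinated through ZooKeeper gives high availability and keeps downtime during failover to a minimum. Failover is fast but not instantaneous, so clients may see a brief pause while the standby becomes active.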
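The choice of three JournalNodes follows from majority arithmetic: a quorum of N nodes keeps working as long as more than half are reachable. A minimal sketch, where the function name `tolerated_failures` is a hypothetical helper:

```python
def tolerated_failures(journal_nodes: int) -> int:
    """Number of JournalNodes that can fail while a strict majority remains."""
    return (journal_nodes - 1) // 2


for n in (1, 2, 3, 4, 5):
    print(n, "JournalNodes tolerate", tolerated_failures(n), "failure(s)")
```

Three nodes tolerate one failure, and a fourth adds no extra tolerance, which is why odd counts such as three or five are the usual choice.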