HDFS high availability (HA) keeps your data accessible even when a NameNode fails. Without HA, the NameNode is a single point of failure: if it goes down, the entire cluster becomes unavailable.
HDFS high availability in Hadoop
1. Set up two NameNodes: one active and one standby.
2. Configure shared edit-log storage using a quorum of JournalNodes.
3. Use ZooKeeper to manage automatic failover between the NameNodes.
4. Update hdfs-site.xml with HA settings, including dfs.nameservices, dfs.ha.namenodes, dfs.namenode.rpc-address, dfs.namenode.http-address, dfs.namenode.shared.edits.dir, and dfs.client.failover.proxy.provider.
5. Start the JournalNodes, ZooKeeper, and both NameNodes.
6. Use hdfs haadmin commands to check NameNode status and switch the active NameNode if needed.
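Step 5 can be sketched with daemon commands; the syntax below is the Hadoop 3 form (Hadoop 2 uses hadoop-daemon.sh wrappers instead), and each command must be run on the appropriate host of an already-configured cluster:

```shell
# Startup order sketch (Hadoop 3 command syntax; run each on the right host).
# Assumes ZooKeeper is already running and hdfs-site.xml is configured for HA.

# 1. Start the JournalNodes (on each JournalNode host).
hdfs --daemon start journalnode

# 2. Initialize ZooKeeper state for automatic failover (once, from one NameNode host).
hdfs zkfc -formatZK

# 3. Start the first NameNode, then bootstrap and start the second.
hdfs --daemon start namenode          # on nn1's host
hdfs namenode -bootstrapStandby       # on nn2's host, one time only
hdfs --daemon start namenode          # on nn2's host

# 4. Start a ZKFailoverController on each NameNode host.
hdfs --daemon start zkfc
```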
This setup requires careful configuration of multiple components.
ZooKeeper helps decide which NameNode is active to avoid conflicts.
# Example hdfs-site.xml snippet for HA
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>host1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>host2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>host1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>host2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://host1:8485;host2:8485;host3:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
</configuration>
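One setting the snippet above omits is fencing: Hadoop requires at least one fencing method to be configured before it will perform a failover, so the previously active NameNode can be cut off from the shared edits. A minimal sketch, assuming sshfence with an example key path:

```xml
<!-- Add to hdfs-site.xml; at least one fencing method is required for failover. -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <!-- Example path; point this at the key the fencing user actually has. -->
  <value>/home/hdfs/.ssh/id_rsa</value>
</property>
```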
# Edge case: Only one NameNode configured (no HA)
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1</value>
</property>
</configuration>

# Edge case: Standby NameNode is down
# The active NameNode continues serving data.
# Failover can be manual or automatic if ZooKeeper is configured.
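Automatic failover needs settings beyond the hdfs-site.xml snippet shown earlier: it must be enabled explicitly, and core-site.xml must point clients at the nameservice and at the ZooKeeper quorum. A sketch, with the ZooKeeper hostnames as placeholders:

```xml
<!-- hdfs-site.xml: turn on automatic failover via ZKFC. -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

<!-- core-site.xml: clients address the nameservice, not a single host. -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <!-- Placeholder hosts; list your actual ZooKeeper ensemble. -->
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
```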
This script shows how to check which NameNode is active or standby, and how to switch the active NameNode manually.
# This is a shell script example to check HDFS HA status

# Check the status of the NameNodes
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Manually fail over from nn1 to nn2
hdfs haadmin -failover nn1 nn2

# Check the status again
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
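The decision that the ZKFailoverController automates can be sketched as a small script. The get_state function below is a hypothetical stub standing in for `hdfs haadmin -getServiceState <id>`, so the example runs without a cluster; a real script would call the hdfs command directly:

```shell
#!/bin/sh
# Hypothetical stub standing in for `hdfs haadmin -getServiceState <id>`.
# Simulates nn1 having failed while nn2 is healthy in standby.
get_state() {
  case "$1" in
    nn1) echo "unreachable" ;;
    nn2) echo "standby" ;;
  esac
}

# Promote the standby only when the active is gone and the standby is healthy,
# mirroring the check performed before a failover is triggered.
maybe_failover() {
  if [ "$(get_state "$1")" != "active" ] && [ "$(get_state "$2")" = "standby" ]; then
    echo "failover: promote $2"   # real script: hdfs haadmin -failover "$1" "$2"
  else
    echo "no action needed"
  fi
}

maybe_failover nn1 nn2
```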
Time complexity: HA adds a small write overhead, since each edit must reach a quorum of JournalNodes, but it keeps the system responsive through a failover.
Space complexity: Requires extra storage for the shared edit logs on the JournalNodes and a full metadata copy on the standby NameNode.
Common mistake: Misconfiguring ZooKeeper, or not running a ZKFailoverController (ZKFC) on each NameNode host, can cause automatic failover to fail.
Use HA when you cannot tolerate NameNode downtime. For small or non-critical clusters, a single NameNode may be enough.
HDFS high availability uses two NameNodes to avoid downtime.
Shared storage and ZooKeeper help manage failover automatically.
Proper configuration is key to keep data safe and accessible.