How to Check Hadoop Cluster Health Quickly and Easily
To check Hadoop cluster health, use the
hdfs dfsadmin -report command to get storage and node status, and visit the ResourceManager web UI at http://:8088 for real-time cluster metrics. These tools show node availability, disk usage, and running jobs to ensure your cluster is healthy.Syntax
The main command to check Hadoop cluster health is hdfs dfsadmin -report. This command shows the status of DataNodes, storage capacity, and overall cluster health.
Additionally, the ResourceManager web UI provides a dashboard for monitoring cluster metrics and job status.
bash
hdfs dfsadmin -report
Example
This example runs the hdfs dfsadmin -report command to display the health of the Hadoop cluster, including live and dead DataNodes, storage usage, and block information.
bash
hdfs dfsadmin -report
Output
Configured Capacity: 1000000000 (953.67 MB)
Present Capacity: 900000000 (858.31 MB)
DFS Remaining: 700000000 (667.57 MB)
DFS Used: 200000000 (190.73 MB)
DFS Used%: 22.22%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Live datanodes (3):
Name: datanode1:50010
Hostname: datanode1
Decommission Status : Normal
Configured Capacity: 333333333 (317.89 MB)
DFS Used: 66666666 (63.58 MB)
DFS Remaining: 266666667 (254.31 MB)
Name: datanode2:50010
Hostname: datanode2
Decommission Status : Normal
Configured Capacity: 333333333 (317.89 MB)
DFS Used: 66666666 (63.58 MB)
DFS Remaining: 266666667 (254.31 MB)
Name: datanode3:50010
Hostname: datanode3
Decommission Status : Normal
Configured Capacity: 333333334 (317.89 MB)
DFS Used: 66666668 (63.58 MB)
DFS Remaining: 266666666 (254.31 MB)
Common Pitfalls
- Running
hdfs dfsadmin -reportwithout proper permissions can cause errors; always run as the Hadoop user or with sudo. - Checking only the command line report misses real-time job and resource info; always use the ResourceManager web UI for full health status.
- Ignoring dead DataNodes can cause data loss; monitor and fix node failures promptly.
bash
Wrong: hdfs dfsadmin -report # run as a non-privileged user may fail Right: sudo -u hadoop hdfs dfsadmin -report # run as hadoop user to get full report
Quick Reference
Use these tools to check Hadoop cluster health:
| Tool | Purpose | Access |
|---|---|---|
| hdfs dfsadmin -report | Shows DataNode status and storage usage | Command line |
| ResourceManager Web UI | Monitors cluster resource usage and job status | http:// |
| NameNode Web UI | Shows HDFS file system health and block info | http:// |
Key Takeaways
Use
hdfs dfsadmin -report to get a quick summary of cluster storage and node health.Check the ResourceManager web UI for real-time cluster resource and job monitoring.
Always monitor for dead DataNodes and fix them to avoid data loss.
Run health check commands with proper permissions to avoid errors.
Combine command line and web UI tools for full cluster health insight.