Why is node decommissioning important in a Hadoop cluster?
Think about what happens to data when a node is removed.
Decommissioning ensures that the data blocks stored on the node are re-replicated to other nodes before the node is removed, preventing data loss and under-replicated blocks.
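A hedged sketch of the usual decommissioning flow (the hostname and exclude-file path are illustrative; the actual path is whatever dfs.hosts.exclude points at in your hdfs-site.xml):

```shell
# 1. Add the node's hostname to the exclude file referenced by
#    dfs.hosts.exclude in hdfs-site.xml (path is illustrative).
echo "datanode05.example.com" >> /etc/hadoop/conf/dfs.exclude

# 2. Tell the NameNode to re-read its include/exclude host lists;
#    the node then enters the "Decommission in progress" state.
hdfs dfsadmin -refreshNodes

# 3. Watch until the node's status changes to "Decommissioned",
#    meaning all of its blocks have been re-replicated elsewhere.
hdfs dfsadmin -report -decommissioning
```

Only after the status reads "Decommissioned" is it safe to shut the machine down.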
What is the output of the following command snippet when checking node decommissioning status?
hdfs dfsadmin -report | grep Decommissioning
Consider what the command does and what it filters.
The command prints only the report lines that contain the word "Decommissioning", typically the summary line such as "Decommissioning datanodes (1):", which gives the count of nodes currently being decommissioned; the node names themselves appear on separate lines that this grep filters out.
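A simulation of what that pipeline does, using a fabricated report excerpt (a real run would pipe the live output of hdfs dfsadmin -report instead; the hostname and address are made up):

```shell
# Fabricated excerpt of a dfsadmin report, for illustration only.
report='Live datanodes (4):
Decommissioning datanodes (1):
Name: 10.0.0.5:9866 (datanode05.example.com)
Decommission Status : Decommission in progress'

# grep keeps only lines containing the literal word "Decommissioning".
printf '%s\n' "$report" | grep Decommissioning
# Prints the one matching summary line:
#   Decommissioning datanodes (1):
```

Note that the per-node "Decommission Status" line does not contain the string "Decommissioning", so it is not matched.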
Given a Hadoop cluster with 5 nodes and replication factor 3, if one node is decommissioned, how many copies of each data block remain after decommissioning completes?
Think about how replication factor affects data safety during node removal.
Three copies remain. The replication factor is unchanged; HDFS re-replicates the decommissioned node's blocks onto the remaining four nodes so that each block again has 3 copies.
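One way to confirm this after decommissioning completes is a filesystem check; this is a sketch against the whole namespace, and the exact summary wording varies slightly between Hadoop versions:

```shell
# fsck's summary includes lines like "Under-replicated blocks: 0"
# and "Average block replication: 3.0"; filtering on "replic" keeps
# just those lines.
hdfs fsck / | grep -i replic
```

Zero under-replicated blocks and an average replication at the configured factor indicate that re-replication finished successfully.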
Which of the following is a likely cause of a node remaining stuck in the decommissioning state indefinitely?
Consider what could block data movement during decommissioning.
If the node cannot re-replicate its data, for example because of network problems between DataNodes or insufficient capacity on the remaining nodes, it will stay in the "Decommission in progress" state indefinitely.
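A hedged diagnostic checklist for a stuck node (the hostname is illustrative):

```shell
# How many nodes are still decommissioning, and what does the
# NameNode report about them?
hdfs dfsadmin -report -decommissioning

# Are blocks failing to reach the target replication? fsck's
# summary reports "Under-replicated blocks".
hdfs fsck / | grep -i under

# Can the stuck node still be reached over the network?
# Connectivity loss is a common cause of a stalled decommission.
ping -c 3 datanode05.example.com
```

If under-replicated blocks never drain, check the NameNode logs for replication errors and verify that the remaining nodes have enough free capacity to absorb the blocks.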
You want to scale your Hadoop cluster by adding 3 new nodes. Which step should you perform first to integrate them properly?
Think about how new nodes join a Hadoop cluster.
The new nodes must first be added to the cluster configuration (the workers file, and the include file if dfs.hosts is set); their DataNode services are then started so they can register with the NameNode and join the cluster.
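The steps above can be sketched as follows, assuming Hadoop 3.x daemon commands; hostnames and paths are illustrative:

```shell
# 1. Add the new hostnames to the workers file (and to the include
#    file if dfs.hosts is configured in hdfs-site.xml).
printf '%s\n' node06 node07 node08 >> /etc/hadoop/conf/workers

# 2. Tell the NameNode to re-read its host lists.
hdfs dfsadmin -refreshNodes

# 3. On each new node, start the DataNode (and the NodeManager,
#    if the node should also run YARN containers).
hdfs --daemon start datanode
yarn --daemon start nodemanager

# 4. Optionally run the balancer so existing data spreads onto the
#    new nodes; -threshold is the allowed disk-usage deviation in %.
hdfs balancer -threshold 10
```

Without the refresh and the balancer, the new nodes would join but initially hold no data, so only newly written blocks would land on them.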