Backup and disaster recovery in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding Hadoop Backup Strategies

Which of the following best describes the primary purpose of Hadoop's DistCp tool in backup and disaster recovery?

A. It compresses Hadoop data files to save storage space during backup.
B. It copies large amounts of data efficiently between Hadoop clusters for backup or migration.
C. It monitors Hadoop cluster health to prevent data loss.
D. It encrypts Hadoop data before backup to ensure security.
💡 Hint

Think about tools designed to move data between clusters.

Predict Output
intermediate
HDFS Snapshot Output

What will be the output of the following HDFS command sequence?

hdfs dfs -mkdir /data
hdfs dfs -put file1.txt /data/
hdfs dfsadmin -allowSnapshot /data
hdfs dfs -createSnapshot /data snap1
hdfs dfs -rm /data/file1.txt
hdfs dfs -ls /data/.snapshot/snap1
A. Lists all files except file1.txt in the snapshot.
B. Shows an empty directory because file1.txt was deleted.
C. Lists file1.txt inside the snapshot directory.
D. Returns an error because snapshots cannot be created on /data.
💡 Hint

Remember what snapshots do in HDFS.
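
For reference, a snapshot path is read like any other HDFS path. The sketch below copies a file out of a snapshot back into the live directory. The function name `restore_from_snapshot` is a hypothetical helper, not part of Hadoop, and it assumes the named snapshot already exists.

```shell
# restore_from_snapshot DIR SNAPSHOT NAME  (hypothetical helper)
# Copies NAME from DIR's read-only snapshot back into the live directory.
restore_from_snapshot() {
  dir=$1
  snap=$2
  name=$3
  hdfs dfs -cp "$dir/.snapshot/$snap/$name" "$dir/"
}

# Example call, using the paths from the problem above:
#   restore_from_snapshot /data snap1 file1.txt
```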

Data Output
advanced
Analyzing Backup Data Size Reduction

You run a Hadoop backup job that uses compression. The original data size is 500 GB. After backup, the compressed backup size is 150 GB. What is the compression ratio?

A. 3.33
B. 0.3
C. 350
D. 0.03
💡 Hint

Compression ratio = original size / compressed size.
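
The formula in the hint can be sketched as a small shell function. The name `compression_ratio` and the sample figures are illustrative assumptions, not values taken from the problem.

```shell
# compression_ratio ORIGINAL COMPRESSED  (hypothetical helper)
# Prints original / compressed to two decimals; both sizes must use the same unit.
compression_ratio() {
  awk -v o="$1" -v c="$2" 'BEGIN {
    if ((c + 0) <= 0) exit 1    # refuse a zero or negative compressed size
    printf "%.2f\n", o / c
  }'
}

compression_ratio 1200 400   # prints 3.00
```

A ratio above 1 means the compressed backup is smaller than the original data.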

🔧 Debug
advanced
Identifying Error in Hadoop Backup Script

Consider this Hadoop backup script snippet:

hdfs dfs -mkdir /backup
hadoop distcp /user/data /backup/data_backup
hdfs dfs -rm -r /user/data

What is the main risk or error in this script?

A. The rm command will fail because /user/data is not empty.
B. The distcp command syntax is incorrect and will cause a syntax error.
C. The backup directory /backup does not exist before copying.
D. The script deletes original data immediately after copying without verifying backup success.
💡 Hint

Think about safe backup practices.
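
One way to make a copy-then-delete script more cautious is to gate each step on the exit status of the previous one. The sketch below is a minimal illustration: `backup_then_remove` is a hypothetical helper, and the directory-existence test is only a basic sanity check, not proof that every file was copied.

```shell
# backup_then_remove SRC DST  (hypothetical helper)
# Copies SRC to DST with DistCp and removes SRC only if every step succeeded.
backup_then_remove() {
  src=$1
  dst=$2

  # Stop here if DistCp reports a failure; the source is left untouched.
  hadoop distcp "$src" "$dst" || {
    echo "distcp failed; keeping $src" >&2
    return 1
  }

  # Basic sanity check: the destination directory must exist.
  hdfs dfs -test -d "$dst" || {
    echo "$dst not found; keeping $src" >&2
    return 1
  }

  hdfs dfs -rm -r "$src"
}

# Example call, using the paths from the script above:
#   backup_then_remove /user/data /backup/data_backup
```

A stricter check might compare `hdfs dfs -count` output for the source and the destination before removing anything.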

🚀 Application
expert
Designing a Disaster Recovery Plan for Hadoop

You are tasked with designing a disaster recovery plan for a Hadoop cluster that must minimize downtime and data loss. Which combination of strategies is best?

A. Use HDFS snapshots regularly and replicate data to a remote cluster using DistCp.
B. Only rely on HDFS replication factor set to 3 within the same cluster.
C. Backup data manually once a month and store on local disks.
D. Use a single-node cluster backup to external USB drives weekly.
💡 Hint

Consider both data safety and recovery speed.
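
As an illustration of pairing a point-in-time snapshot with off-cluster replication, the sketch below takes a snapshot and then copies from it with DistCp. The helper name, the remote URI in the example, and the choice of the `-update` flag are assumptions. The directory must already be snapshottable (`hdfs dfsadmin -allowSnapshot`).

```shell
# snapshot_and_replicate DIR SNAPSHOT REMOTE_URI  (hypothetical helper)
# Takes a snapshot of DIR, then copies the snapshot's contents to REMOTE_URI.
snapshot_and_replicate() {
  dir=$1
  snap=$2
  remote=$3

  # Abort if the snapshot cannot be created (for example, DIR is not snapshottable).
  hdfs dfs -createSnapshot "$dir" "$snap" || return 1

  # Reading from the snapshot gives DistCp a view that does not change mid-copy.
  # With -update, the contents of the source directory are synced into REMOTE_URI.
  hadoop distcp -update "$dir/.snapshot/$snap" "$remote"
}

# Example call (the NameNode host and target path are hypothetical):
#   snapshot_and_replicate /data "snap-$(date +%Y%m%d)" hdfs://dr-nn:8020/backup/data
```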