Which of the following is NOT a core component of the Hadoop ecosystem?
Think about which components handle storage, processing, and resource management.
HDFS, MapReduce, and YARN are the core components of Hadoop. Hive is a data warehouse tool built on top of Hadoop, not a core component.
What will be the output of the following YARN command simulation in a Hadoop cluster?
yarn node -list
Assuming the cluster has 3 nodes registered and all are healthy.
Consider what the command 'yarn node -list' shows when nodes are healthy.
The command lists every node registered with the YARN ResourceManager along with its state. Since all 3 nodes are healthy, each is reported in the RUNNING state, together with its HTTP address and running-container count.
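A sketch of what the output might look like for a healthy 3-node cluster (hostnames, ports, and container counts below are illustrative, not from a real cluster):

```
$ yarn node -list
Total Nodes:3
         Node-Id             Node-State  Node-Http-Address       Number-of-Running-Containers
node1.example.com:45454         RUNNING  node1.example.com:8042                             2
node2.example.com:45454         RUNNING  node2.example.com:8042                             1
node3.example.com:45454         RUNNING  node3.example.com:8042                             0
```

Note that by default the command lists only running nodes; unhealthy or lost nodes require the `-all` or `-states` flags.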
Given a MapReduce job that processes 1000 input records and outputs key-value pairs, which option correctly shows the number of output records if the reducer combines all values by key and there are 10 unique keys?
Think about how reducers aggregate values by keys.
The reducer is invoked once per unique key and emits one combined record per key, so the output count equals the number of unique keys: 10.
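The shuffle-and-reduce behavior can be sketched outside Hadoop with a minimal simulation. This is not MapReduce itself, just an illustration of the grouping logic, with the record and key counts taken from the question:

```python
from collections import defaultdict

# 1000 input records spread across 10 unique keys (hypothetical data).
records = [(f"key{i % 10}", 1) for i in range(1000)]

# Shuffle phase: group all values by their key.
groups = defaultdict(list)
for key, value in records:
    groups[key].append(value)

# Reduce phase: the reducer emits one output record per unique key,
# here combining the values with a sum.
output = {key: sum(values) for key, values in groups.items()}

print(len(output))  # 10 output records, one per unique key
```

However many input records arrive, the number of reducer output records is bounded by the number of unique keys when each key is reduced to a single pair.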
Which visualization best represents the relationship between Hadoop ecosystem tools: HDFS, YARN, MapReduce, Hive, and Pig?
Consider the roles of storage, resource management, processing, and querying in Hadoop.
HDFS stores data, YARN manages cluster resources, MapReduce processes data, and Hive and Pig sit on top as higher-level interfaces: Hive offers SQL-like queries, Pig a dataflow scripting language.
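The relationship described above is commonly drawn as a layered stack (a simplified sketch; real deployments add many more components):

```
+----------------------+
|   Hive    |   Pig    |  <- high-level query / scripting
+----------------------+
|      MapReduce       |  <- distributed processing
+----------------------+
|        YARN          |  <- resource management
+----------------------+
|        HDFS          |  <- distributed storage
+----------------------+
```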
A MapReduce job fails with the error: 'java.lang.OutOfMemoryError: Java heap space'. Which option is the most likely cause?
Think about what causes Java heap space errors in processing jobs.
The error means the JVM running the task exhausted its heap, most commonly because the reducer buffered all values for a key in memory at once instead of streaming over them.
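Besides rewriting the reducer to stream, a common mitigation is to raise the reducer's memory allocation. A sketch of the relevant `mapred-site.xml` properties (the values shown are illustrative, to be tuned per cluster; `-Xmx` is conventionally set to roughly 80% of the container size):

```
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3276m</value>
</property>
```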