
MapReduce job execution flow in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the first step in a MapReduce job execution?

In the MapReduce job execution flow, which step happens first?

A. Splitting the input data into chunks
B. Shuffling and sorting the map output
C. Reducing the intermediate data
D. Writing the final output to HDFS
💡 Hint

Think about how the data is prepared before processing.
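
As a rough illustration of that preparation step, the sketch below cuts a list of records into chunks and assigns each chunk to its own map task. It is a toy model only: the function name `split_input` and the `split_size` parameter are hypothetical, and it splits by record count, whereas Hadoop computes its input splits over files stored in HDFS.

```python
def split_input(records, split_size):
    """Cut the input into consecutive chunks of at most split_size records."""
    return [records[i:i + split_size] for i in range(0, len(records), split_size)]


# Hypothetical input: one word per record.
records = ["apple", "banana", "apple", "cherry", "banana"]
splits = split_input(records, split_size=2)

# In this toy model, each split is handed to its own map task.
for task_id, split in enumerate(splits):
    print(f"map task {task_id}: {split}")
```

The last chunk may be smaller than the others, which is why the number of map tasks depends on the input size rather than being fixed in advance.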

🧠 Conceptual
intermediate
What happens during the shuffle phase in MapReduce?

During the MapReduce job execution, what is the main purpose of the shuffle phase?

A. Executing the map function on input data
B. Splitting the input data into smaller chunks
C. Writing the final output to the distributed file system
D. Sorting and transferring map outputs to reducers
💡 Hint

Think about how data moves from mappers to reducers.
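
To make that movement concrete, here is a minimal single-process sketch of how (key, value) pairs could be routed from mappers to reducers. The helper names `partition` and `shuffle` and the use of `zlib.crc32` are assumptions chosen for a repeatable example. A real cluster moves this data over the network between nodes, which the sketch does not model.

```python
import zlib
from collections import defaultdict


def partition(key, num_reducers):
    """Pick a reducer for a key; crc32 keeps the choice stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def shuffle(map_output, num_reducers):
    """Route each (key, value) pair to a reducer, then sort and group by key."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in map_output:
        buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer gets its keys in sorted order, with all values for a key together.
    return [dict(sorted(bucket.items())) for bucket in buckets]


# Hypothetical map output for a word count over three words.
map_output = [("apple", 1), ("banana", 1), ("apple", 1)]
reducer_inputs = shuffle(map_output, num_reducers=2)

for reducer_id, grouped in enumerate(reducer_inputs):
    print(f"reducer {reducer_id}: {grouped}")
```

The point of partitioning by key is that every value for a given key lands at the same reducer, so each reducer can process its keys without coordinating with the others.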

data_output
advanced
Identify the output of the map phase given this input

Given the input data ["apple", "banana", "apple"] and a map function that outputs (word, 1) for each word, what is the map phase output?

Python
# Input records: one word per record
input_data = ["apple", "banana", "apple"]
# Map step: emit one (word, 1) pair per input word
map_output = [(word, 1) for word in input_data]
print(map_output)
A. [('banana', 1), ('apple', 1), ('apple', 1)]
B. [('apple', 1), ('banana', 1), ('apple', 1)]
C. [('apple', 1), ('banana', 1)]
D. [('apple', 2), ('banana', 1)]
💡 Hint

The map function emits one pair per input word.

data_output
advanced
What is the reducer output for this intermediate data?

Given the intermediate data {'apple': [1, 1], 'banana': [1]}, what is the output of the reducer that sums the values?

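Python
# Intermediate data after the shuffle: each key with its list of values
intermediate_data = {'apple': [1, 1], 'banana': [1]}
# Reduce step: sum the values for each key
reducer_output = {k: sum(v) for k, v in intermediate_data.items()}
print(reducer_output)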
A. {'apple': 2, 'banana': 1}
B. {'apple': 1, 'banana': 1}
C. {'apple': [2], 'banana': [1]}
D. {'apple': 3, 'banana': 1}
💡 Hint

The reducer sums all values for each key.
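
For readers who want to see how the map and reduce problems fit together, the following sketch chains the three phases into one small word count. The function names `map_phase`, `shuffle_phase`, and `reduce_phase` are hypothetical, and everything runs in one Python process, so it only mirrors the data flow, not Hadoop's distributed execution.

```python
from collections import defaultdict


def map_phase(words):
    """Emit one (word, 1) pair per input word."""
    return [(word, 1) for word in words]


def shuffle_phase(pairs):
    """Group values by key and order the keys, as the reducer expects."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Sum the list of values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


input_data = ["apple", "banana", "apple"]
pairs = map_phase(input_data)      # list of (word, 1) pairs
grouped = shuffle_phase(pairs)     # word -> list of ones
counts = reduce_phase(grouped)     # word -> total
print(counts)
```

Printing `pairs` and `grouped` as well shows the intermediate shapes that the map and reducer problems ask about.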

🧠 Conceptual
expert
Which component manages the overall MapReduce job execution?

In Hadoop's classic MapReduce (MRv1) architecture, which component manages the entire job execution, including resource allocation and task scheduling?

A. NameNode
B. TaskTracker
C. JobTracker
D. DataNode
💡 Hint

Think about the component that coordinates tasks across the cluster.