
Reduce phase explained in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
What is the main role of the Reduce phase in Hadoop MapReduce?

In Hadoop MapReduce, after the Map phase processes input data, the Reduce phase takes over. What is the main role of the Reduce phase?

A. It sorts the input data before mapping.
B. It splits the input data into smaller chunks for processing.
C. It aggregates and summarizes the intermediate data produced by the Map phase.
D. It reads raw input data from the source files.
💡 Hint

Think about what happens after the Map phase outputs key-value pairs.
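
As a rough illustration of what happens to those key-value pairs, the sketch below groups them by key and then applies an aggregation to each group. It is plain Python rather than the Hadoop API; the names `shuffle` and `reduce_phase` and the sample temperature readings are made up for this example.

```python
from collections import defaultdict

def shuffle(pairs):
    """Group intermediate (key, value) pairs by key, as happens between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def reduce_phase(grouped, aggregate):
    """Apply one aggregation function to the list of values collected for each key."""
    return {key: aggregate(values) for key, values in grouped.items()}

# Hypothetical Map output: (city, temperature reading)
map_output = [("oslo", 4), ("cairo", 31), ("oslo", 7), ("cairo", 29)]
grouped = shuffle(map_output)

print(reduce_phase(grouped, max))  # highest reading per city
print(reduce_phase(grouped, len))  # number of readings per city
```

Passing a different function (`sum`, `min`, and so on) changes the summary without touching the grouping step.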

Predict Output
intermediate
Output of Reduce function aggregation

Consider the following simplified Reduce function in Hadoop MapReduce that sums values for each key:

def reduce(key, values):
    total = 0
    for v in values:
        total += v
    print(f"{key}: {total}")

What will be the output if the input to reduce is key = 'apple' and values = [2, 3, 5]?

reduce('apple', [2, 3, 5])
A. apple: 10
B. apple: 235
C. apple: 0
D. apple: 5
💡 Hint

Sum all numbers in the list.
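
In an actual job a reducer usually emits its result instead of printing a formatted string. With Hadoop Streaming, for instance, a Python reducer reads sorted, tab-separated `key<TAB>value` lines from standard input. A minimal sketch of that pattern, assuming integer values and using a hypothetical helper `reduce_lines` and made-up sample lines so it can be tried without a cluster:

```python
from itertools import groupby

def reduce_lines(lines):
    """Yield (key, total) for each run of consecutive lines that share a key.

    Assumes the lines arrive sorted by key, which the shuffle step provides.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(int(value) for _, value in group)

# Made-up, already-sorted reducer input
sample = ["kiwi\t1\n", "kiwi\t4\n", "plum\t2\n"]
for key, total in reduce_lines(sample):
    print(f"{key}\t{total}")
```

In a streaming job the same function could be fed `sys.stdin` in place of `sample`.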

Data Output
advanced
Resulting key-value pairs after Reduce phase

Given the following intermediate key-value pairs from the Map phase:

{'cat': [1, 1, 1], 'dog': [1, 1], 'bird': [1]}

If the Reduce phase sums the values for each key, what is the resulting output?

A. {'cat': 1, 'dog': 1, 'bird': 1}
B. {'cat': 3, 'dog': 2, 'bird': 1}
C. {'cat': [3], 'dog': [2], 'bird': [1]}
D. {'cat': 6, 'dog': 4, 'bird': 2}
💡 Hint

Sum the list of values for each key.
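
Summing grouped values can be written as a single dictionary comprehension in Python. A small sketch on different, made-up data (the name `sum_per_key` is illustrative only):

```python
def sum_per_key(grouped):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}

# Made-up grouped Map output; an empty list sums to 0
grouped_colors = {"red": [1, 1], "blue": [1, 1, 1, 1], "green": []}
print(sum_per_key(grouped_colors))
```

Each key ends up mapped to one number rather than to a list.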

🔧 Debug
advanced
Identify the error in this Reduce function

Look at this Reduce function code snippet:

def reduce(key, values):
    total = 0
    for v in values
        total += v
    print(f"{key}: {total}")

What error will this code produce when run?

A. NameError because total is not defined
B. IndentationError due to wrong indentation
C. TypeError because values is not iterable
D. SyntaxError due to missing colon after for loop
💡 Hint

Check the for loop syntax carefully.
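
One way to check a suspect snippet without running it is to hand its source to Python's parser. The sketch below uses the standard `ast.parse`; the helper name `syntax_problem` is hypothetical, and the exact error message text varies between Python versions, so only the exception class and line number are reported.

```python
import ast

def syntax_problem(source):
    """Return (exception class name, line number) if source fails to parse, else None."""
    try:
        ast.parse(source)
    except SyntaxError as exc:  # IndentationError is a subclass of SyntaxError
        return type(exc).__name__, exc.lineno
    return None

# The beginning of the snippet from this challenge, as a string
snippet = (
    "def reduce(key, values):\n"
    "    total = 0\n"
    "    for v in values\n"
    "        total += v\n"
)
print(syntax_problem(snippet))
```

Because the parser rejects the source before any of it runs, runtime errors such as `NameError` or `TypeError` cannot occur for this snippet.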

🚀 Application
expert
Choosing the correct Reduce phase output for word count

In a word count MapReduce job, the Map phase outputs key-value pairs where the key is a word and the value is 1 for each occurrence. The Reduce phase sums these counts. Given the following Map output for the word 'data':

[('data', 1), ('data', 1), ('data', 1), ('data', 1)]

Which of the following is the correct Reduce phase output for the key 'data'?

A. ('data', 4)
B. ('data', [1, 1, 1, 1])
C. ('data', 1)
D. ('data', 0)
💡 Hint

Sum all the counts for the word.
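
The word count flow described in this challenge can be sketched end to end in plain Python. This is an illustration under simple assumptions (whitespace tokenization, lowercasing), not Hadoop code; `map_words`, `reduce_counts`, and the sample lines are made up for the example.

```python
from collections import defaultdict

def map_words(line):
    """Map step: emit (word, 1) for every whitespace-separated word in a line."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_counts(pairs):
    """Reduce step: sum the 1s emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Made-up input lines
lines = ["big clusters need tools", "Big clusters process logs"]
pairs = [pair for line in lines for pair in map_words(line)]
print(reduce_counts(pairs))
```

Each word appears once in the result, paired with the number of times the Map step emitted it.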