Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
Easy: Complete the code to define the reduce function that sums values for each key.
Hadoop
def reduce(key, values):
    result = 0
    for value in values:
        result += [1]
    return (key, result)
Common Mistakes
Using 'key' instead of 'value' inside the loop.
Adding 'values' instead of individual 'value'.
The reduce function sums each value in the list of values for a given key, so we add 'value' to the result.
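To see the completed function run outside Hadoop, here is a minimal local sketch. The `group_by_key` helper and the sample pairs are hypothetical stand-ins for the shuffle step that Hadoop performs between map and reduce; they are not part of any Hadoop API.

```python
from collections import defaultdict


def reduce(key, values):
    """Sum all values that share one key."""
    result = 0
    for value in values:
        result += value  # add each individual value, not the key or the whole list
    return (key, result)


def group_by_key(pairs):
    """Hypothetical stand-in for the shuffle step: collect values per key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


# Sample mapper output, invented for illustration.
mapped = [("apple", 1), ("pear", 1), ("apple", 1), ("apple", 1)]
results = [reduce(key, values) for key, values in sorted(group_by_key(mapped).items())]
print(results)  # [('apple', 3), ('pear', 1)]
```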
2. Fill in the blank
Medium: Complete the code to emit the final key-value pair from the reduce function.
Hadoop
def reduce(key, values):
    total = sum(values)
    [1](key, total)
Common Mistakes
Using 'print' instead of 'emit'.
Using 'return', which hands the result back to the caller but does not write it to the job output.
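In the exercise's MapReduce pseudocode, 'emit' outputs the final key and its aggregated value. Hadoop itself has no built-in 'emit': the Java API writes output with context.write(key, value), and Hadoop Streaming scripts write lines to standard output.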
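Because `emit` is not a Python built-in, a local test needs a definition for it. The sketch below assumes a Hadoop Streaming-style setup, where a reducer script reports results as tab-separated lines on standard output. The `emit` helper and its `out` parameter are hypothetical, written only for this example.

```python
import sys


def emit(key, value, out=sys.stdout):
    """Hypothetical helper: write one tab-separated key-value line."""
    out.write(f"{key}\t{value}\n")


def reduce(key, values, out=sys.stdout):
    """Aggregate the values for one key and emit the result."""
    total = sum(values)
    emit(key, total, out)  # output the pair instead of returning it


reduce("apple", [1, 1, 1])  # writes "apple<TAB>3" to standard output
```

Passing a different `out` object, such as an in-memory buffer, makes the reducer easy to check without running a job.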
3. Fill in the blank
Hard: Fix the error in the reduce function to correctly sum values.
Hadoop
def reduce(key, values):
    total = 0
    for val in values:
        total = total [1] val
    emit(key, total)
Common Mistakes
Using '-' which subtracts values.
Using '*' or '/' which multiply or divide.
To sum values, use the '+' operator to add each val to total.
4. Fill in the blank
Hard: Fill both blanks to create a dictionary of word counts using reduce logic.
Hadoop
word_counts = {word: [1] for word in words if len(word) [2] 3}
Common Mistakes
Using 'len(word)' as count instead of 'words.count(word)'.
Using '<=' instead of '>' for filtering.
Count occurrences of each word with 'words.count(word)' and keep only words longer than 3 characters with '>'.
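With both blanks filled in, the comprehension runs as ordinary Python. Note that `words.count(word)` rescans the whole list for every word, so the sketch below also shows a single-pass alternative using `collections.Counter` from the standard library. The sample `words` list is invented for illustration.

```python
from collections import Counter

# Sample input, invented for illustration.
words = ["data", "map", "reduce", "data", "key", "reduce", "data"]

# The exercise's comprehension with both blanks filled in.
word_counts = {word: words.count(word) for word in words if len(word) > 3}

# Single-pass alternative: count everything once, then filter by length.
counts = Counter(words)
word_counts_fast = {word: n for word, n in counts.items() if len(word) > 3}

print(word_counts)                      # {'data': 3, 'reduce': 2}
print(word_counts_fast == word_counts)  # True
```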
5. Fill in the blank
Hard: Fill all three blanks to create a reduce function that filters and sums values.
Hadoop
def reduce(key, values):
    filtered = [v for v in values if v [1] 10]
    total = sum(filtered)
    if total [2] 0:
        [3](key, total)
Common Mistakes
Using '<' instead of '>' for filtering.
Using 'print' instead of 'emit' to output.
Filter values greater than 10 with '>', check that total is at least 0 with '>=', then output the pair with 'emit'.
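A completed version can be tried locally as sketched below. The `emit` helper and the `emitted` list are hypothetical stand-ins for job output, and the sample inputs are invented.

```python
emitted = []  # hypothetical sink standing in for job output


def emit(key, value):
    """Hypothetical helper: record one output pair."""
    emitted.append((key, value))


def reduce(key, values):
    """Sum the values above 10 and emit the total if it is non-negative."""
    filtered = [v for v in values if v > 10]  # keep only values above 10
    total = sum(filtered)
    if total >= 0:
        emit(key, total)


reduce("a", [5, 12, 30])  # 12 and 30 pass the filter
reduce("b", [1, 2, 3])    # nothing passes, so the total is 0
print(emitted)            # [('a', 42), ('b', 0)]
```

Since every value that survives the filter is greater than 10, the total is either 0 or positive, so the '>=' check always passes here. Using '>' instead would skip keys such as "b" that have no qualifying values.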