Word count as MapReduce example in Hadoop - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of the Mapper function in Word Count
Given the following Mapper code snippet in Hadoop MapReduce for word count, what is the output for the input line: "hello world hello"?
Hadoop
public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
A. (hello,1), (world,1), (hello,1)
B. (hello,2), (world,1)
C. (hello world hello, 1)
D. (hello,1), (world,2)
💡 Hint
The Mapper outputs each word with a count of 1 for every occurrence.
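To see the hint in action without a cluster, the map step can be imitated in plain Java. This is only a sketch: `MapSketch` and `mapLine` are hypothetical names, not Hadoop classes, and a list of entries stands in for `context.write`. It deliberately uses a different input line from the question.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Plain-Java stand-in for the map step; MapSketch and mapLine are
// hypothetical names, not part of the Hadoop API.
class MapSketch {
    // Emits one (word, 1) pair per token, in input order, with no aggregation.
    static List<Map.Entry<String, Integer>> mapLine(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    public static void main(String[] args) {
        // prints [to=1, be=1, or=1, not=1, to=1, be=1]
        System.out.println(mapLine("to be or not to be"));
    }
}
```

Repeated words appear as separate pairs because the map step never looks at earlier tokens; adding the counts together is left to a later phase.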
Predict Output
intermediate
Output of the Reducer function in Word Count
Given the following Reducer code snippet in Hadoop MapReduce for word count, what is the output for the input key "hello" and values [1, 1, 1]?
Hadoop
public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
A. (hello, 0)
B. (hello, 3)
C. (hello, 1)
D. (hello, 6)
💡 Hint
The Reducer sums all counts for the same word.
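The summing loop can be tried on its own in plain Java. The sketch below assumes a hypothetical `ReduceSketch.reduce` helper that takes the key and its grouped values as an ordinary list; it is not the Hadoop `Reducer` API. The values are intentionally not all 1, to show that the loop adds the values rather than counting them.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for the reduce step; ReduceSketch is a hypothetical name.
class ReduceSketch {
    // Adds up every value received for one key, mirroring the loop in IntSumReducer.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int val : values) {
            sum += val;
        }
        return sum;
    }

    public static void main(String[] args) {
        // prints (be, 4)
        System.out.println("(be, " + reduce("be", Arrays.asList(2, 1, 1)) + ")");
    }
}
```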
Data Output
advanced
Final output of Word Count MapReduce job
Suppose the input text file contains the lines:
hello world
hello hadoop
What is the final output of the Word Count MapReduce job?
A. (hello, 1), (world, 1), (hadoop, 1)
B. (hello, 3), (world, 1), (hadoop, 1)
C. (hello, 2), (world, 1), (hadoop, 1)
D. (hello world, 1), (hello hadoop, 1)
💡 Hint
Count how many times each word appears in all lines combined.
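The whole job can be approximated in a few lines of plain Java: map each line to (word, 1) pairs, group the pairs by word, then sum each group. This is a single-process sketch under stated assumptions; `WordCountSketch` and `wordCount` are hypothetical names, and a `TreeMap` stands in for the shuffle, which in a real job also delivers keys to each reducer in sorted order. The sample lines differ from the ones in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Single-process imitation of map -> shuffle -> reduce; hypothetical names throughout.
class WordCountSketch {
    static Map<String, Integer> wordCount(List<String> lines) {
        // Map + shuffle: every token emits a 1, and the 1s are grouped by word.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer itr = new StringTokenizer(line);
            while (itr.hasMoreTokens()) {
                grouped.computeIfAbsent(itr.nextToken(), k -> new ArrayList<>()).add(1);
            }
        }
        // Reduce: sum the grouped values for each word.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            int sum = 0;
            for (int v : e.getValue()) {
                sum += v;
            }
            counts.put(e.getKey(), sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {big=2, cluster=1, data=3}
        System.out.println(wordCount(Arrays.asList("big data", "big cluster", "data data")));
    }
}
```

Counts are accumulated across all lines, not per line, because grouping happens on the word alone.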
🔧 Debug
advanced
Identify the error in this Mapper code
What error will this Mapper code produce when run?
public static class TokenizerMapper extends Mapper {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
A. No error, runs correctly
B. ClassCastException at runtime
C. NullPointerException at runtime
D. Syntax error due to a missing semicolon
💡 Hint
Check punctuation carefully in Java code.
🚀 Application
expert
Choosing the correct Combiner function for Word Count
Which of the following Combiner implementations correctly optimizes the Word Count MapReduce job by reducing data transfer between Mapper and Reducer?
A.
public static class IntSumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
B.
public static class IntSumCombiner extends Mapper<Text, IntWritable, Text, IntWritable> {
    public void map(Text key, IntWritable value, Context context) throws IOException, InterruptedException {
        context.write(key, value);
    }
}
C.
public static class IntSumCombiner extends Reducer<Text, Text, Text, IntWritable> {
    public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (Text val : values) {
            sum += Integer.parseInt(val.toString());
        }
        context.write(key, new IntWritable(sum));
    }
}
D.
public static class IntSumCombiner extends Reducer<Text, IntWritable, Text, Text> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum += val.get();
        }
        context.write(key, new Text(String.valueOf(sum)));
    }
}
💡 Hint
A Combiner's input and output key/value types must both match the Mapper's output types (here Text and IntWritable), which are also the Reducer's input types.
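Why a combiner cuts data transfer can be shown with a small plain-Java simulation. It assumes two hypothetical map tasks, each pre-summing its own words before anything is "shuffled"; `CombinerSketch`, `combine`, and `reduce` are illustrative names, not Hadoop classes. In a real job the combiner class would be registered on the driver with `job.setCombinerClass(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Simulates local pre-aggregation per map task; all names here are hypothetical.
class CombinerSketch {
    // Combine: one (word, partial sum) record per distinct word in this task's input.
    static Map<String, Integer> combine(String split) {
        Map<String, Integer> partial = new TreeMap<>();
        StringTokenizer itr = new StringTokenizer(split);
        while (itr.hasMoreTokens()) {
            partial.merge(itr.nextToken(), 1, Integer::sum);
        }
        return partial;
    }

    // Reduce: adds the partial sums arriving from every map task.
    static Map<String, Integer> reduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> splits = Arrays.asList("a b a a", "b a b");
        List<Map<String, Integer>> partials = new ArrayList<>();
        int shuffled = 0;
        for (String split : splits) {
            Map<String, Integer> partial = combine(split);
            shuffled += partial.size();
            partials.add(partial);
        }
        // prints "records shuffled: 4" (7 tokens would be sent without combining)
        System.out.println("records shuffled: " + shuffled);
        // prints {a=4, b=3}
        System.out.println(reduce(partials));
    }
}
```

The totals match what summing the raw (word, 1) pairs would give because addition is associative and commutative. That property matters in practice: Hadoop treats the combiner as an optional optimization and may run it zero, one, or several times per map task.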