Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to emit each word with count 1 in the mapper.
Hadoop
public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.[1]());
            context.write(word, one);
        }
    }
}
Common Mistakes
Using next() which does not exist in StringTokenizer.
Using nextWord() which is not a method of StringTokenizer.
The StringTokenizer method to get the next token is nextToken().
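The tokenizing loop can be checked outside Hadoop with plain Java, since StringTokenizer comes from java.util. This stand-alone sketch (the class and method names are illustrative, not part of the job) shows nextToken() yielding each whitespace-separated word:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerDemo {
    // Collects every token from a line, mirroring the mapper's while loop.
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            out.add(itr.nextToken()); // nextToken(), not next() or nextWord()
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("hello hadoop hello"));
    }
}
```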
2. Fill in the blank (medium)
Complete the reducer code to sum counts for each word.
Hadoop
public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable val : values) {
            sum [1] val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
Common Mistakes
Using -= which subtracts instead of adding.
Using *= or /= which multiply or divide instead of adding.
We add each count to sum using the += operator.
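The accumulation pattern is ordinary Java and can be tested without Hadoop. This sketch (names are illustrative) uses a plain int array in place of the reducer's Iterable<IntWritable>:

```java
public class SumDemo {
    // Sums counts with +=, mirroring the reducer's loop over values.
    static int sum(int[] counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c; // accumulate; -=, *=, /= would give wrong totals
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{1, 1, 1}));
    }
}
```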
3. Fill in the blank (hard)
Fix the error in the mapper's setup method so the Text object is initialized.
Hadoop
public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word;

    protected void setup(Context context) throws IOException, InterruptedException {
        word = new [1]();
    }

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
Common Mistakes
Using String instead of Text causes type errors.
Using IntWritable or Writable which are incorrect types here.
The Text object must be initialized with new Text() in Hadoop MapReduce.
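The setup() pattern itself can be sketched in plain Java. Here StringBuilder stands in for Hadoop's Text (an assumption for runnability, since Text is not in the JDK); the point is the same: a reusable mutable field is allocated once before processing begins, then refilled per record:

```java
public class SetupDemo {
    private StringBuilder word; // declared but not initialized, like the Text field

    // Analogue of Mapper.setup(): allocate the reusable object once.
    void setup() {
        word = new StringBuilder();
    }

    // Analogue of map(): reuse the buffer instead of allocating per record.
    String map(String token) {
        word.setLength(0);
        word.append(token);
        return word.toString();
    }

    public static void main(String[] args) {
        SetupDemo d = new SetupDemo();
        d.setup();
        System.out.println(d.map("hadoop"));
    }
}
```

Skipping setup() here would leave word null and throw a NullPointerException on first use, which is exactly the bug the task asks you to fix.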
4. Fill in the blank (hard)
Fill both blanks to complete the driver code setting input and output paths.
Hadoop
public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.[1](job, new Path(args[0]));
    FileOutputFormat.[2](job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
}
Common Mistakes
Using setInputPath instead of addInputPath causes errors.
Using addOutputPath which does not exist.
FileInputFormat uses addInputPath to add an input path; FileOutputFormat uses setOutputPath to set the output path.
5. Fill in the blank (hard)
Fill all three blanks to complete the mapper class declaration and output types.
Hadoop
public static class TokenizerMapper extends Mapper<[1], [2], [3], IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map([1] key, [2] value, Context context) throws IOException, InterruptedException {
        StringTokenizer itr = new StringTokenizer(value.toString());
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, one);
        }
    }
}
Common Mistakes
Using LongWritable as the input key without updating the map method signature to match.
Using IntWritable for output key which is incorrect.
The Mapper input key is Object, input value is Text, output key is Text.