Complete the code to read data from HDFS using Hadoop streaming.
hadoop fs -cat [1]/input.txt

The correct HDFS path to the input data is /data, so the completed command is hadoop fs -cat /data/input.txt, which prints the file input.txt from that directory to stdout.
Complete the code to define a mapper function in Hadoop streaming that converts input text to lowercase.
#!/bin/bash
while read line; do
    echo [1]
done
The blank is filled with the tr command: the mapper converts each input line to lowercase by piping it through tr '[:upper:]' '[:lower:]'.
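As a quick check, the completed mapper can be run locally by piping sample text through it; a minimal sketch (the file name mapper.sh is just for illustration):

```shell
# Write the completed mapper to a file. The blank is filled with the
# tr pipeline from the answer above.
cat > mapper.sh <<'EOF'
#!/bin/bash
# Read each line from stdin and emit it lowercased.
while read line; do
    echo "$line" | tr '[:upper:]' '[:lower:]'
done
EOF
chmod +x mapper.sh

# Local smoke test: no Hadoop cluster needed.
echo "Hello HDFS" | ./mapper.sh
# prints: hello hdfs
```

Testing streaming scripts this way catches syntax errors before a job is ever submitted to the cluster.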
Fix the error in the reducer code that sums counts from the mapper output.
#!/usr/bin/env python3
import sys

current_word = None
current_count = 0

for line in sys.stdin:
    word, count = line.strip().split('\t')
    count = int(count)
    if word == [1]:
        current_count += count
    else:
        if current_word:
            print(f"{current_word}\t{current_count}")
        current_word = word
        current_count = count

if current_word == [1]:
    print(f"{current_word}\t{current_count}")
The reducer compares each incoming word with current_word so that counts for the same word accumulate; the final check after the loop emits the total for the last word once the input ends.
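The corrected reducer can be exercised locally by feeding it pre-sorted word<TAB>count pairs, just as Hadoop's shuffle would deliver them. A sketch with the blanks filled per the explanation above (the file name reducer.py is illustrative):

```shell
# reducer.py - sums counts for consecutive identical words on stdin.
cat > reducer.py <<'EOF'
#!/usr/bin/env python3
import sys

current_word = None
current_count = 0

for line in sys.stdin:
    word, count = line.strip().split('\t')
    count = int(count)
    if word == current_word:      # same word as before: accumulate
        current_count += count
    else:
        if current_word:          # emit the previous word's total
            print(f"{current_word}\t{current_count}")
        current_word = word
        current_count = count

if current_word == word:          # flush the last word's total
    print(f"{current_word}\t{current_count}")
EOF

# Local test with pre-sorted mapper-style output.
printf 'apple\t1\napple\t2\nbanana\t1\n' | python3 reducer.py
```

Because the shuffle phase guarantees sorted keys, the reducer only ever needs to track one word at a time.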
Fill both blanks to create a batch layer job that reads from HDFS and writes output to a directory.
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    -input [1] \
    -output [2] \
    -mapper mapper.sh \
    -reducer reducer.py
The batch job reads input from /data/batch_input and writes output to /user/hadoop/batch_output.
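Before submitting a jar like this, streaming jobs are commonly smoke-tested on the command line, with sort standing in for Hadoop's shuffle phase. A sketch assuming a hypothetical word-count pair: wc_mapper.sh emits one word<TAB>1 pair per word (unlike the lowercase-only mapper above), and wc_reducer.py sums the counts:

```shell
# wc_mapper.sh (hypothetical): lowercase, split into words, emit "word<TAB>1".
cat > wc_mapper.sh <<'EOF'
#!/bin/bash
tr '[:upper:]' '[:lower:]' | awk '{ for (i = 1; i <= NF; i++) print $i "\t1" }'
EOF

# wc_reducer.py (hypothetical): sum counts for consecutive identical words.
cat > wc_reducer.py <<'EOF'
#!/usr/bin/env python3
import sys
current_word, current_count = None, 0
for line in sys.stdin:
    word, count = line.strip().split('\t')
    count = int(count)
    if word == current_word:
        current_count += count
    else:
        if current_word:
            print(f"{current_word}\t{current_count}")
        current_word, current_count = word, count
if current_word:
    print(f"{current_word}\t{current_count}")
EOF

# "sort" stands in for Hadoop's sort-and-shuffle between map and reduce.
printf 'Apple banana\napple\n' | bash wc_mapper.sh | sort | python3 wc_reducer.py
```

If this local pipeline produces the expected counts, the same scripts can be passed to -mapper and -reducer unchanged.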
Fill all three blanks to create a streaming layer command that reads from Kafka and writes to HDFS.
kafka-console-consumer.sh --bootstrap-server [1] --topic [2] --from-beginning | \
    hadoop fs -put - [3]/streaming_output/data.txt
The Kafka server is localhost:9092, the topic is user_events, and the HDFS output directory is /user/hadoop.