Word count as MapReduce example in Hadoop - Step-by-Step Execution

Concept Flow - Word count as MapReduce example

Input Text → Map Function → Emit (word, 1) → Shuffle & Sort → Reduce Function → Sum counts per word → Output (word, total count)

The Map function splits the input text into words and emits a (word, 1) pair for each occurrence. The framework then groups these pairs by word (Shuffle & Sort), and the Reduce function sums each group to produce the final count per word.
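The flow above can be sketched on a single machine. This is a minimal in-memory illustration, not Hadoop code: the function names (`map_fn`, `shuffle`, `reduce_fn`, `word_count`) are hypothetical, and a plain dictionary stands in for the framework's Shuffle & Sort.

```python
from collections import defaultdict

def map_fn(line):
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle & Sort: group values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return dict(sorted(groups.items()))

def reduce_fn(word, counts):
    """Reduce: sum the counts collected for one word."""
    return (word, sum(counts))

def word_count(lines):
    """Run the three phases in order on an iterable of text lines."""
    mapped = [pair for line in lines for pair in map_fn(line)]
    grouped = shuffle(mapped)
    return [reduce_fn(word, counts) for word, counts in grouped.items()]
```

Calling `word_count(["hello world hello"])` returns `[("hello", 2), ("world", 1)]`, the same result the walkthrough arrives at.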

Execution Sample

Hadoop

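map(String key, String value):
  # key: identifier of the input record (not used here); value: its text
  for word in value.split():
    emit(word, 1)

reduce(String word, Iterator counts):
  # counts: every 1 emitted for this word, grouped by the framework
  total = 0
  for c in counts:
    total += c
  emit(word, total)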

This code counts how many times each word appears in the input text using Map and Reduce functions.
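One way to turn this pseudocode into something runnable is the style used with Hadoop Streaming, where the mapper and reducer exchange tab-separated `key<TAB>value` text records. The sketch below assumes that record format and takes iterables of lines instead of reading standard input, so it can be tried locally; `mapper` and `reducer` are illustrative names, and `sorted()` stands in for the framework's Shuffle & Sort.

```python
from itertools import groupby

def mapper(lines):
    """Yield one 'word<TAB>1' record per word, as a streaming mapper would print."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(sorted_records):
    """Sum counts for each run of identical keys in key-sorted records."""
    parsed = (record.rsplit("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Local stand-in for the framework: sorting plays the role of Shuffle & Sort.
records = sorted(mapper(["hello world hello"]))
result = list(reducer(records))
```

The reducer relies on its input being sorted by key, which is why the records are sorted before being passed in. Here `result` is `["hello\t2", "world\t1"]`.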

Execution Table

Step	Input	Action	Output
1	"hello world hello"	Map splits text and emits (word,1)	(hello,1), (world,1), (hello,1)
2	(hello,1), (world,1), (hello,1)	Shuffle groups by word	(hello: [1,1]), (world: [1])
3	(hello: [1,1])	Reduce sums counts	(hello, 2)
4	(world: [1])	Reduce sums counts	(world, 1)
5	All words processed	Output final counts	(hello, 2), (world, 1)

Once every word has been processed and its counts summed, the MapReduce job is complete.

Variable Tracker

Variable	Start	After Step 1	After Step 2	After Step 3	Final
map_output	empty	(hello,1),(world,1),(hello,1)	(hello,1),(world,1),(hello,1)	(hello,1),(world,1),(hello,1)	N/A
shuffle_output	empty	empty	(hello:[1,1]),(world:[1])	(hello:[1,1]),(world:[1])	N/A
reduce_output	empty	empty	empty	(hello,2)	(hello,2),(world,1)
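The tracker can be reproduced by recording a snapshot of the three variables after each step. This is a rough sketch: `trace_word_count` and the `history` structure are hypothetical, and unlike the table it keeps intermediate values around instead of marking them N/A at the end.

```python
from collections import defaultdict

def trace_word_count(text):
    """Return (step, snapshot) pairs of the tracked variables after each step."""
    state = {"map_output": [], "shuffle_output": {}, "reduce_output": []}
    history = []

    def snapshot(step):
        # Copy the containers so later steps do not alter earlier snapshots.
        history.append((step, {
            "map_output": list(state["map_output"]),
            "shuffle_output": {k: list(v) for k, v in state["shuffle_output"].items()},
            "reduce_output": list(state["reduce_output"]),
        }))

    # Step 1: Map emits (word, 1) for every word.
    state["map_output"] = [(w, 1) for w in text.split()]
    snapshot(1)

    # Step 2: Shuffle groups the 1s by word.
    groups = defaultdict(list)
    for w, c in state["map_output"]:
        groups[w].append(c)
    state["shuffle_output"] = dict(groups)
    snapshot(2)

    # Steps 3 onward: Reduce handles one word per step, in sorted key order.
    for step, w in enumerate(sorted(groups), start=3):
        state["reduce_output"].append((w, sum(groups[w])))
        snapshot(step)
    return history
```

For the input `"hello world hello"` this yields four snapshots, and `reduce_output` grows from empty to `[("hello", 2)]` to `[("hello", 2), ("world", 1)]`.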

Key Moments - 3 Insights

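Why does the Map function emit (word, 1) instead of just the word?

The 1 is the count for a single occurrence. Giving every word a numeric value lets the Reduce function total the occurrences by simple addition, as in steps 3 and 4 of the execution table.

What happens during the Shuffle & Sort phase?

The framework collects all pairs that share the same word and hands them to Reduce as one group, so (hello,1), (world,1), (hello,1) becomes (hello: [1,1]) and (world: [1]) in step 2.

How does the Reduce function calculate the total count?

It iterates over the list of counts for one word and adds them up, so (hello: [1,1]) becomes (hello, 2).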
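These three questions can be explored with a short sketch. It assumes a sort-then-group implementation of Shuffle & Sort built on `itertools.groupby`; the helper names are hypothetical. The last line hints at why emitting a numeric count is useful: the same reduce function still works if some counts have already been partially summed.

```python
from itertools import groupby
from operator import itemgetter

def shuffle_and_sort(pairs):
    """Sort pairs by key, then collect the values of each run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    return [(key, [value for _, value in run])
            for key, run in groupby(ordered, key=itemgetter(0))]

def reduce_fn(word, counts):
    """Add up whatever counts arrive; they need not all be 1."""
    return (word, sum(counts))

pairs = [("hello", 1), ("world", 1), ("hello", 1)]
grouped = shuffle_and_sort(pairs)
totals = [reduce_fn(word, counts) for word, counts in grouped]

# Hypothetical variation: counts that were already partially summed
# are handled by the same reduce function.
partial = reduce_fn("hello", [2, 1])
```

Here `grouped` is `[("hello", [1, 1]), ("world", [1])]`, `totals` is `[("hello", 2), ("world", 1)]`, and `partial` is `("hello", 3)`.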

Visual Quiz - 3 Questions

Test your understanding

Look at the execution table. What is the output of the Map function at step 1?

A. (hello,1), (world,1), (hello,1)

B. (hello,2), (world,1)

C. (hello:[1,1]), (world:[1])

D. (hello,0), (world,0)

Concept Snapshot

MapReduce Word Count:
- Map splits text, emits (word,1) pairs
- Shuffle groups pairs by word
- Reduce sums counts per word
- Output is (word, total count)
- Used for counting words in large data sets

Full Transcript

This example shows how MapReduce counts words in text. The Map function reads input text and emits each word with a count of 1. Then, the framework groups all counts by word in the Shuffle phase. The Reduce function sums these counts to get total occurrences per word. Finally, the output lists each word with its total count. This process allows counting words efficiently in big data by splitting work across many machines.
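To illustrate how the work might be split, the sketch below simulates several input splits and several reducers in one process. It is an assumption-laden toy: `partition` and `run_job` are hypothetical names, and CRC32 is used only as a stable stand-in for whatever hash a real framework applies to keys. The point is that every occurrence of a word is routed to the same reducer, so each reducer can sum its words independently.

```python
import zlib
from collections import defaultdict

def partition(word, num_reducers):
    """Pick a reducer for a word with a stable hash (CRC32 here, as an assumption)."""
    return zlib.crc32(word.encode("utf-8")) % num_reducers

def run_job(splits, num_reducers=2):
    """Map each input split independently, route pairs by key, then reduce per partition."""
    # One bucket of {word: [counts]} per simulated reducer.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split could be mapped on a different machine
        for word in split.split():
            buckets[partition(word, num_reducers)][word].append(1)
    # Each reducer sums only the words routed to it.
    return [{word: sum(counts) for word, counts in bucket.items()}
            for bucket in buckets]
```

Because a given word always lands in exactly one partition, merging the per-reducer dictionaries gives the full word count with no overlap between them.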