Recall & Review
beginner
What is an input split in Hadoop?
An input split is a logical chunk of the input data that Hadoop carves out of the input files so that the job can be processed in parallel. Each split is processed by one map task.
beginner
Why is data locality important in Hadoop?
Data locality means running a task on the machine that stores its data, or as close to that machine as possible. It reduces network traffic and speeds up processing.
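To make "same machine or close to" concrete, here is a minimal sketch that classifies how close a candidate node is to a split's data. The function name `locality_level`, the node and rack names, and the three labels are hypothetical and chosen for illustration; this is not Hadoop's scheduler code.

```python
def locality_level(task_node, replica_nodes, rack_of):
    """Classify how close task_node is to the nodes holding the data.

    rack_of maps each node name to its rack name (illustrative topology).
    """
    if task_node in replica_nodes:
        return "node-local"   # data is on the same machine, no network read
    replica_racks = {rack_of[n] for n in replica_nodes}
    if rack_of[task_node] in replica_racks:
        return "rack-local"   # data is on another machine in the same rack
    return "off-rack"         # data must cross racks to reach the task


# Hypothetical cluster: four nodes across three racks.
rack_of = {"node1": "rackA", "node2": "rackA", "node3": "rackB", "node4": "rackC"}
replicas = ["node1", "node3"]  # nodes assumed to hold copies of the block

levels = {node: locality_level(node, replicas, rack_of) for node in rack_of}
for node, level in levels.items():
    print(node, level)
```

A scheduler that prefers locality would try the node-local candidates first, then rack-local ones, and fall back to off-rack nodes only when nothing closer is free.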
intermediate
How does Hadoop decide the size of an input split?
Hadoop uses the HDFS block size as the default split size, but the configured minimum and maximum split sizes and the file format (for example, whether the file can be split at all) can change it to balance load and efficiency.
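As a rough illustration of how configuration adjusts the split size, the sketch below clamps the block size between a minimum and a maximum, which is the rule file-based input formats apply. In Hadoop the two bounds correspond to the `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize` properties; the Python function name and the sample values are illustrative assumptions.

```python
MB = 1024 * 1024
NO_LIMIT = 2**63 - 1  # stand-in for "no maximum configured"


def compute_split_size(block_size, min_size, max_size):
    """Clamp the block size between the minimum and maximum split sizes."""
    return max(min_size, min(max_size, block_size))


block = 128 * MB  # assumed HDFS block size for this example

# No overrides: the split size equals the block size.
default_split = compute_split_size(block, 1, NO_LIMIT)

# A maximum below the block size yields smaller splits (more map tasks).
smaller_split = compute_split_size(block, 1, 64 * MB)

# A minimum above the block size yields larger splits (fewer map tasks).
larger_split = compute_split_size(block, 256 * MB, NO_LIMIT)

print(default_split // MB, smaller_split // MB, larger_split // MB)
```

Splits larger than a block trade some data locality for fewer tasks, because part of each split must then come from a block stored elsewhere.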
intermediate
What happens if a map task runs on a node without the data it needs?
The task will read data over the network from another node, which slows down processing and increases network load.
beginner
Explain the relationship between input splits and map tasks.
Each input split is assigned to one map task. The map task processes the data in that split independently.
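The one-split-to-one-map-task relationship can be sketched by cutting a single file into (offset, length) ranges and counting them. The helper `file_splits` is hypothetical; the 1.1 slack factor is modeled on the behavior of Hadoop's file-based input formats, which let the last split run up to 10% over the split size rather than create a tiny trailing split. Treat it as an assumption of this sketch.

```python
MB = 1024 * 1024
SPLIT_SLOP = 1.1  # assumed slack: last split may be up to 10% oversized


def file_splits(file_length, split_size):
    """Return a list of (offset, length) ranges covering the whole file."""
    splits = []
    remaining = file_length
    # Cut full-size splits while more than ~1.1 split sizes remain.
    while remaining / split_size > SPLIT_SLOP:
        splits.append((file_length - remaining, split_size))
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining > 0:
        splits.append((file_length - remaining, remaining))
    return splits


splits = file_splits(300 * MB, 128 * MB)
num_map_tasks = len(splits)  # one map task per split

for offset, length in splits:
    print(f"offset={offset // MB} MB, length={length // MB} MB")
print("map tasks:", num_map_tasks)
```

Under these assumptions, a 300 MB file with 128 MB splits produces three ranges and therefore three map tasks, each of which processes its range independently.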
What does an input split represent in Hadoop?
An input split is a piece of data assigned to a single map task for processing.
Why is data locality beneficial in Hadoop?
Running tasks near the data avoids slow network transfers, making processing faster.
What guides the default size of an input split?
Hadoop uses the HDFS block size as the default size for input splits.
If a map task runs on a node without the data, what happens?
Without local data, the task must fetch data over the network, which is slower.
How many map tasks are created per input split?
Each input split is processed by exactly one map task.
Describe what input splits are and why they matter in Hadoop processing.
Think about how Hadoop breaks big data into smaller parts.
Explain data locality and how it affects the speed of Hadoop jobs.
Consider why running tasks near data is faster.