Recall & Review
beginner
What is Apache Flume used for?
Apache Flume is used to collect, aggregate, and move large amounts of log data from many sources to a centralized data store like HDFS.
beginner
Name the three main components of a Flume agent.
The three main components are Source (collects data), Channel (buffers data), and Sink (delivers data to destination).
intermediate
How does a Flume Channel work?
A Channel temporarily stores events received from the Source until the Sink consumes them, acting like a queue or buffer.
beginner
What is the role of a Flume Sink?
The Sink takes events from the Channel and writes them to the final destination, such as HDFS or HBase.
intermediate
Why is Flume suitable for log collection in big data environments?
Because it handles high volumes of streaming log data reliably and efficiently: durable channels provide fault tolerance, and its agent-based architecture scales horizontally by adding more agents or chaining them together.
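The Source → Channel → Sink pipeline described above is defined in a plain properties file. Below is a minimal sketch of one agent (the agent name `agent1`, the log path, and the HDFS path are illustrative assumptions, not values from this material):

```properties
# Name the components of this agent (assumed agent name: agent1)
agent1.sources  = src1
agent1.channels = ch1
agent1.sinks    = sink1

# Source: tail a log file (example path, adjust to your environment)
agent1.sources.src1.type     = exec
agent1.sources.src1.command  = tail -F /var/log/app.log
agent1.sources.src1.channels = ch1

# Channel: in-memory buffer between Source and Sink
agent1.channels.ch1.type     = memory
agent1.channels.ch1.capacity = 1000

# Sink: write events to HDFS (example path)
agent1.sinks.sink1.type      = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode/flume/logs
agent1.sinks.sink1.channel   = ch1
```

A source can feed multiple channels (`.channels` is plural), while a sink drains exactly one channel (`.channel` is singular), which is why the two property names differ.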
Which Flume component is responsible for receiving data from a source?
The Source collects data from external sources and sends it to the Channel.
What does the Flume Channel do?
The Channel acts as a temporary storage buffer between Source and Sink.
Which destination is commonly used as a Flume Sink target?
HDFS is a common destination for Flume to store collected log data.
What feature of Flume helps it handle failures without losing data?
Channels can be durable to ensure data is not lost during failures.
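Durability comes from the channel type: a memory channel is fast but loses buffered events if the agent dies, while a file channel persists events to disk. A minimal sketch of swapping in a file channel (directory paths are illustrative assumptions):

```properties
# File channel: events survive agent restarts because they are
# checkpointed and stored on disk (example paths, adjust as needed)
agent1.channels.ch1.type          = file
agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
agent1.channels.ch1.dataDirs      = /var/flume/data
```

The trade-off is throughput: the file channel pays disk I/O on every transaction in exchange for not losing data on failure.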
Which of these is NOT a Flume component?
Broker is not a Flume component; Flume uses Source, Channel, and Sink.
Explain how Apache Flume collects and moves log data from source to destination.
Think about the three main parts working together like a pipeline.
Describe why Flume is a good choice for handling large-scale log data in Hadoop environments.
Consider what makes a tool reliable and efficient for big data.