
Flume for log collection in Hadoop - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is Apache Flume used for?
Apache Flume is used to collect, aggregate, and move large amounts of log data from many sources to a centralized data store like HDFS.
beginner
Name the three main components of a Flume agent.
The three main components are Source (collects data), Channel (buffers data), and Sink (delivers data to destination).
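The three components are wired together in an agent's properties file. A minimal sketch (the agent name `agent1`, the log path, and all other values are illustrative placeholders, not from the original):

```properties
# Declare the agent's source, channel, and sink
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Source: tail an application log (path is an assumption)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app.log
agent1.sources.src1.channels = ch1

# Channel: in-memory buffer between source and sink
agent1.channels.ch1.type = memory

# Sink: write events to the Flume log for demonstration
agent1.sinks.sink1.type = logger
agent1.sinks.sink1.channel = ch1
```

Note the wiring: a source lists its `channels`, while a sink names a single `channel`.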
intermediate
How does a Flume Channel work?
A Channel temporarily stores events received from the Source until the Sink consumes them, acting like a queue or buffer.
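The buffering behavior of a channel is tuned through capacity settings. A sketch for a memory channel (the numbers are illustrative, not from the original):

```properties
# Memory channel: fast, but events are lost if the agent crashes
agent1.channels.ch1.type = memory
# Maximum number of events the channel can hold
agent1.channels.ch1.capacity = 10000
# Maximum events taken from the channel per transaction
agent1.channels.ch1.transactionCapacity = 100
```

If the sink falls behind and the channel fills to `capacity`, the source is back-pressured until space frees up.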
beginner
What is the role of a Flume Sink?
The Sink takes events from the Channel and writes them to the final destination, such as HDFS or HBase.
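A typical HDFS sink configuration looks like the following sketch (the NameNode address and path pattern are assumptions for illustration):

```properties
# HDFS sink: deliver events from channel ch1 into date-partitioned HDFS directories
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/logs/%Y-%m-%d
# Write raw event bodies rather than SequenceFiles
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel = ch1
```

The `%Y-%m-%d` escape sequences let Flume partition output by event timestamp, which keeps downstream Hadoop jobs simple.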
intermediate
Why is Flume suitable for log collection in big data environments?
Because it can handle high volumes of streaming data reliably and efficiently, with fault tolerance and scalability.
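Fault tolerance comes largely from durable channels: a file channel persists events to disk, so buffered data survives an agent restart. A sketch (directory paths are illustrative assumptions):

```properties
# File channel: slower than memory, but events survive agent crashes
agent1.channels.ch1.type = file
# Where channel checkpoints are stored
agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
# Where event data is written (can list multiple dirs for throughput)
agent1.channels.ch1.dataDirs = /var/flume/data
```

Combined with transactional hand-off between source, channel, and sink, this is what lets Flume avoid losing events on failure.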
Which Flume component is responsible for receiving data from a source?
A. Sink
B. Channel
C. Agent
D. Source
What does the Flume Channel do?
A. Sends data to HDFS
B. Buffers data between Source and Sink
C. Collects data from logs
D. Processes data transformations
Which destination is commonly used as a Flume Sink target?
A. Kafka
B. MySQL
C. HDFS
D. Redis
What feature of Flume helps it handle failures without losing data?
A. Channel durability
B. Source buffering
C. Sink retries
D. Agent clustering
Which of these is NOT a Flume component?
A. Broker
B. Channel
C. Sink
D. Source
Explain how Apache Flume collects and moves log data from source to destination.
Think about the three main parts working together like a pipeline.
Describe why Flume is a good choice for handling large-scale log data in Hadoop environments.
Consider what makes a tool reliable and efficient for big data.