
Flume for log collection in Hadoop - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is Apache Flume used for?
Apache Flume is used to collect, aggregate, and move large amounts of log data from many sources to a centralized data store like HDFS.
beginner
Name the three main components of a Flume agent.
The three main components are Source (collects data), Channel (buffers data), and Sink (delivers data to destination).
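The three components are wired together in an agent's properties file. A minimal sketch (the agent name `agent1`, the log path, and all other values are illustrative placeholders, not from the original):

```properties
# Declare the agent's source, channel, and sink
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Source: tail an application log (path is an assumption)
agent1.sources.src1.type = exec
agent1.sources.src1.command = tail -F /var/log/app.log
agent1.sources.src1.channels = ch1

# Channel: in-memory buffer between source and sink
agent1.channels.ch1.type = memory

# Sink: write events to the Flume log for demonstration
agent1.sinks.sink1.type = logger
agent1.sinks.sink1.channel = ch1
```

Note the wiring: a source lists its `channels`, while a sink names a single `channel`.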
intermediate
How does a Flume Channel work?
A Channel temporarily stores events received from the Source until the Sink consumes them, acting like a queue or buffer.
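The buffering behavior of a channel is tuned through capacity settings. A sketch for a memory channel (the numbers are illustrative, not from the original):

```properties
# Memory channel: fast, but events are lost if the agent crashes
agent1.channels.ch1.type = memory
# Maximum number of events the channel can hold
agent1.channels.ch1.capacity = 10000
# Maximum events taken from the channel per transaction
agent1.channels.ch1.transactionCapacity = 100
```

If the sink falls behind and the channel fills to `capacity`, the source is back-pressured until space frees up.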
beginner
What is the role of a Flume Sink?
The Sink takes events from the Channel and writes them to the final destination, such as HDFS or HBase.
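A typical HDFS sink configuration looks like the following sketch (the NameNode address and path pattern are assumptions for illustration):

```properties
# HDFS sink: deliver events from channel ch1 into date-partitioned HDFS directories
agent1.sinks.sink1.type = hdfs
agent1.sinks.sink1.hdfs.path = hdfs://namenode:8020/flume/logs/%Y-%m-%d
# Write raw event bodies rather than SequenceFiles
agent1.sinks.sink1.hdfs.fileType = DataStream
agent1.sinks.sink1.channel = ch1
```

The `%Y-%m-%d` escape sequences let Flume partition output by event timestamp, which keeps downstream Hadoop jobs simple.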
intermediate
Why is Flume suitable for log collection in big data environments?
Because it can handle high volumes of streaming data reliably and efficiently, with fault tolerance and scalability.
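Fault tolerance comes largely from durable channels: a file channel persists events to disk, so buffered data survives an agent restart. A sketch (directory paths are illustrative assumptions):

```properties
# File channel: slower than memory, but events survive agent crashes
agent1.channels.ch1.type = file
# Where channel checkpoints are stored
agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
# Where event data is written (can list multiple dirs for throughput)
agent1.channels.ch1.dataDirs = /var/flume/data
```

Combined with transactional hand-off between source, channel, and sink, this is what lets Flume avoid losing events on failure.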
Which Flume component is responsible for receiving data from a source?
A. Sink
B. Channel
C. Agent
D. Source
What does the Flume Channel do?
A. Sends data to HDFS
B. Buffers data between Source and Sink
C. Collects data from logs
D. Processes data transformations
Which destination is commonly used as a Flume Sink target?
A. Kafka
B. MySQL
C. HDFS
D. Redis
What feature of Flume helps it handle failures without losing data?
A. Channel durability
B. Source buffering
C. Sink retries
D. Agent clustering
Which of these is NOT a Flume component?
A. Broker
B. Channel
C. Sink
D. Source
Explain how Apache Flume collects and moves log data from source to destination.
Think about the three main parts working together like a pipeline.
Describe why Flume is a good choice for handling large-scale log data in Hadoop environments.
Consider what makes a tool reliable and efficient for big data.