Challenge - 5 Problems

🎖️

Flume Log Collection Master

🧠 Conceptual

intermediate

What is the primary role of a Flume agent in log collection?

In Apache Flume, what does an agent mainly do when collecting logs?

A. It converts logs into SQL queries for database insertion.

B. It collects data from sources, processes it, and sends it to sinks.

C. It only monitors the health of the Hadoop cluster.

D. It stores logs permanently in HDFS without processing.

❓ Query Result

intermediate

What output does this Flume configuration produce?

Given this Flume configuration snippet, what is the expected output destination for the logs?

agent.sources = source1
agent.channels = channel1
agent.sinks = sink1

agent.sources.source1.type = exec
agent.sources.source1.command = tail -F /var/log/syslog

agent.channels.channel1.type = memory
agent.channels.channel1.capacity = 1000

agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = hdfs://namenode/flume/logs/
agent.sinks.sink1.hdfs.filePrefix = syslog-

agent.sources.source1.channels = channel1
agent.sinks.sink1.channel = channel1

A. Logs are sent to a Kafka topic named 'syslog'.

B. Logs are collected from /var/log/syslog and stored locally on the agent machine.

C. Logs are continuously collected from /var/log/syslog and stored in HDFS under /flume/logs/ with prefix 'syslog-'.

D. Logs are discarded after being read from /var/log/syslog.
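
One way to check a reading of a configuration like the one above is to trace which source feeds which channel and which sink drains it. The following is a minimal sketch, not part of Flume: the helper names `parse_flume_properties` and `trace_flows` are hypothetical, and the parser only handles simple `key = value` lines.

```python
def parse_flume_properties(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def trace_flows(props, agent="agent"):
    """Return (source, channel, sink, sink_type) tuples for one agent."""
    flows = []
    sources = props.get(f"{agent}.sources", "").split()
    sinks = props.get(f"{agent}.sinks", "").split()
    for source in sources:
        # A source may list several channels, separated by spaces.
        channels = props.get(f"{agent}.sources.{source}.channels", "").split()
        for sink in sinks:
            # A sink names exactly one channel.
            sink_channel = props.get(f"{agent}.sinks.{sink}.channel")
            if sink_channel in channels:
                sink_type = props.get(f"{agent}.sinks.{sink}.type")
                flows.append((source, sink_channel, sink, sink_type))
    return flows


CONFIG = """
agent.sources = source1
agent.channels = channel1
agent.sinks = sink1

agent.sources.source1.type = exec
agent.sources.source1.command = tail -F /var/log/syslog

agent.channels.channel1.type = memory
agent.channels.channel1.capacity = 1000

agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = hdfs://namenode/flume/logs/
agent.sinks.sink1.hdfs.filePrefix = syslog-

agent.sources.source1.channels = channel1
agent.sinks.sink1.channel = channel1
"""

props = parse_flume_properties(CONFIG)
flows = trace_flows(props)
for source, channel, sink, sink_type in flows:
    print(f"{source} -> {channel} -> {sink} ({sink_type})")
```

The sketch only reads the wiring keys; it does not validate component types or contact any cluster.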

📝 Syntax

advanced

Identify the syntax error in this Flume configuration snippet

Which option contains a configuration error that will prevent the Flume agent from running correctly?

agent.sources = source1
agent.channels = channel1
agent.sinks = sink1

agent.sources.source1.type = exec
agent.sources.source1.command = tail -F /var/log/syslog

agent.channels.channel1.type = memory
agent.channels.channel1.capacity = 1000

agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = hdfs://namenode/flume/logs/
agent.sinks.sink1.hdfs.filePrefix = syslog-

agent.sources.source1.channels = channel1
agent.sinks.sink1.channel = channel1

A. agent.sinks.sink1.channel = channel1

B. agent.sources.source1.channels = channel1

C. agent.channels.channel1.capacity = 1000

D. agent.sinks.sink1.channels = channel1
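
In Flume, a source can write to several channels and is wired with the plural key `channels`, while a sink reads from exactly one channel and is wired with the singular key `channel`. As a rough sketch of how that rule could be checked before starting an agent, the hypothetical helper `find_wiring_errors` below inspects an already-parsed properties dict; the sample dict and its component names are assumptions made up for illustration.

```python
def find_wiring_errors(props, agent="agent"):
    """Report sources and sinks whose channel wiring uses the wrong key."""
    errors = []
    for source in props.get(f"{agent}.sources", "").split():
        prefix = f"{agent}.sources.{source}"
        if f"{prefix}.channel" in props:
            errors.append(f"{prefix}.channel: sources use the plural key 'channels'")
        if f"{prefix}.channels" not in props:
            errors.append(f"{prefix}: no channels configured")
    for sink in props.get(f"{agent}.sinks", "").split():
        prefix = f"{agent}.sinks.{sink}"
        if f"{prefix}.channels" in props:
            errors.append(f"{prefix}.channels: sinks use the singular key 'channel'")
        if f"{prefix}.channel" not in props:
            errors.append(f"{prefix}: no channel configured")
    return errors


# Hypothetical example: srcA is wired correctly, snkA uses the wrong key.
sample = {
    "agent.sources": "srcA",
    "agent.sinks": "snkA",
    "agent.sources.srcA.channels": "chA",
    "agent.sinks.snkA.channels": "chA",
}

errors = find_wiring_errors(sample)
for message in errors:
    print(message)
```

This is a lint-style check only; Flume's own validation at startup remains the authority on what a given version accepts.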

❓ Optimization

advanced

How can you optimize Flume for high log throughput?

You want to improve Flume's performance to handle a large volume of logs with minimal delay. Which option is the best optimization?

A. Use a memory channel instead of a file channel to reduce disk I/O latency.

B. Disable batch processing to send each event immediately.

C. Set the source type to 'avro' to compress logs before sending.

D. Increase the number of sinks but keep a single channel to avoid complexity.
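
Flume moves events through channels in transactions, and batch size controls how many events share one transaction. The arithmetic below is a rough illustration only: it counts transactions for a fixed number of events and says nothing about real latency or disk behavior. The function name `transactions_needed` and the event count are assumptions chosen for the example.

```python
import math


def transactions_needed(event_count, batch_size):
    """Number of transactions required to move event_count events."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(event_count / batch_size)


# Compare per-event delivery (batch size 1) with larger batches.
for batch_size in (1, 100, 1000):
    count = transactions_needed(100_000, batch_size)
    print(f"batch size {batch_size}: {count} transactions")
```

Each transaction carries fixed overhead, so fewer transactions for the same number of events generally means less overhead per event, at the cost of holding events slightly longer before they are committed.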

🔧 Debug

expert

Why does this Flume agent fail to deliver logs to HDFS?

Given this Flume agent configuration, logs are not appearing in HDFS. What is the most likely cause?

agent.sources = source1
agent.channels = channel1
agent.sinks = sink1

agent.sources.source1.type = exec
agent.sources.source1.command = tail -F /var/log/syslog

agent.channels.channel1.type = file
agent.channels.channel1.checkpointDir = /tmp/flume/checkpoint
agent.channels.channel1.dataDirs = /tmp/flume/data

agent.sinks.sink1.type = hdfs
agent.sinks.sink1.hdfs.path = hdfs://namenode/flume/logs/
agent.sinks.sink1.hdfs.filePrefix = syslog-

agent.sources.source1.channels = channel1
agent.sinks.sink1.channel = channel1

A. The file channel directories (/tmp/flume/checkpoint and /tmp/flume/data) do not have proper write permissions.

B. The source command 'tail -F /var/log/syslog' is incorrect and does not produce output.

C. The sink type 'hdfs' is deprecated and should be replaced with 'hdfs_sink'.

D. The channel type 'file' is incompatible with the exec source.
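
When debugging a file channel, one quick thing to rule out is whether the checkpoint and data directories are usable by the account that runs the agent. The sketch below is a generic filesystem check, not a Flume tool: `check_writable_dirs` is a hypothetical helper, and it must be run as the same user as the Flume process for the result to mean anything.

```python
import os


def check_writable_dirs(paths):
    """Map each path to 'ok', 'missing', or 'not writable' for the current user."""
    report = {}
    for path in paths:
        if not os.path.isdir(path):
            report[path] = "missing"
        elif not os.access(path, os.W_OK | os.X_OK):
            # Creating files in a directory needs both write and search permission.
            report[path] = "not writable"
        else:
            report[path] = "ok"
    return report


# Paths taken from the configuration in the question above.
for path, status in check_writable_dirs(
    ["/tmp/flume/checkpoint", "/tmp/flume/data"]
).items():
    print(f"{path}: {status}")
```

A "missing" result is not necessarily a failure, since the directory may be created on startup if the parent is writable; the agent's own log is the place to confirm the actual cause.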