Kafka producers have a batch size setting (batch.size, measured in bytes) that caps how much record data for one partition is grouped into a single batch. Why does tuning this batch size help handle production load better?
Think about how sending many small requests affects network and broker load.
Increasing batch size allows the producer to send more records in one request, reducing the total number of requests. This improves throughput and reduces network overhead, helping handle higher production loads efficiently.
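As a rough illustration of why larger batches cut request overhead, the sketch below counts requests under simplifying assumptions: every record has the same size, all records go to one partition, and each request carries exactly one full batch. The function name and the sample numbers are hypothetical. In kafka-python the corresponding producer settings are batch_size (bytes) and linger_ms.

```python
import math

def requests_needed(total_records, record_bytes, batch_bytes):
    """Estimate produce requests when each request carries one batch."""
    # At least one record per batch, even if a record exceeds batch_bytes.
    records_per_batch = max(1, batch_bytes // record_bytes)
    return math.ceil(total_records / records_per_batch)

# Hypothetical workload: 100,000 records of 200 bytes each.
small = requests_needed(100_000, 200, 16_384)  # 16 KiB batches
large = requests_needed(100_000, 200, 65_536)  # 64 KiB batches
print(small, large)
```

Real producers also send a batch early once linger_ms expires, so actual request counts depend on how quickly records arrive.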
Given the following Python code snippet using Kafka's consumer API, what will be printed?
from kafka import KafkaConsumer

consumer = KafkaConsumer('topic1', group_id='group1', bootstrap_servers=['localhost:9092'])
partitions = consumer.partitions_for_topic('topic1')
print(len(partitions))
Check what partitions_for_topic returns when the topic exists.
The partitions_for_topic method returns a set of partition numbers for the topic. Taking the length gives the number of partitions. If the topic exists, this will be a positive integer.
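The snippet assumes the topic exists. A defensive variant might look like the sketch below, which assumes the client can return None when it has no metadata for the topic. The helper name partition_count is hypothetical, and plain Python values stand in for a live consumer so the example runs without a broker.

```python
def partition_count(partitions):
    """Return the number of partitions, treating a missing topic as 0."""
    # Assumption: the client may hand back None instead of a set
    # when it has no metadata for the topic.
    return 0 if partitions is None else len(partitions)

print(partition_count({0, 1, 2}))  # a topic with three partitions
print(partition_count(None))       # topic not found
```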
Consider this Kafka producer code snippet:
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers='localhost:9092')
for i in range(1000):
    producer.send('topic1', value=b'message')
    producer.flush()  # blocks until every buffered record has been sent
Why does this cause high latency when sending many messages?
Think about what flush() does and how it affects sending speed.
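Calling flush() after every message forces the producer to block until that message has been sent and acknowledged before continuing. This makes each send effectively synchronous and prevents records from being batched together, which reduces throughput and increases latency under load.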
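The usual remedy is to flush once after the loop instead of on every iteration. The sketch below compares the number of blocking calls in both patterns. CountingProducer is a hypothetical stand-in that only counts calls, so the example runs without a broker; it is not part of any Kafka client.

```python
class CountingProducer:
    """Hypothetical stand-in for a producer; it only counts calls."""

    def __init__(self):
        self.sent = 0
        self.flushes = 0

    def send(self, topic, value):
        self.sent += 1      # a real producer would buffer the record here

    def flush(self):
        self.flushes += 1   # a real producer would block here

def send_all(producer, count, flush_each):
    """Send `count` messages and return how many flush() calls were made."""
    for _ in range(count):
        producer.send('topic1', value=b'message')
        if flush_each:
            producer.flush()
    if not flush_each:
        producer.flush()    # one blocking call for the whole run
    return producer.flushes

per_message = send_all(CountingProducer(), 1000, flush_each=True)
once = send_all(CountingProducer(), 1000, flush_each=False)
print(per_message, once)
```

With a real producer, the single flush at the end lets the client group records into batches in the background while the loop keeps running.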
Identify the option that contains a syntax error in Kafka consumer configuration in Python.
Look carefully at the commas separating arguments.
Option D is missing a comma between group_id='group1' and bootstrap_servers, causing a syntax error.
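To see how Python reacts to a missing comma between keyword arguments, the sketch below parses two source strings without executing them. Both strings are hypothetical, modeled on the consumer call used earlier in this quiz rather than on the actual answer options.

```python
import ast

# Hypothetical call strings; they are parsed, never executed.
valid = "KafkaConsumer('topic1', group_id='group1', bootstrap_servers=['localhost:9092'])"
missing_comma = "KafkaConsumer('topic1', group_id='group1' bootstrap_servers=['localhost:9092'])"

def parses(source):
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(parses(valid), parses(missing_comma))
```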
Kafka topics have a retention policy that controls how long messages are kept. How does tuning this retention policy help manage production load effectively?
Think about how disk space and message storage affect broker performance.
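Reducing retention time (for example, through the topic-level retention.ms setting) limits how long messages stay on brokers, so old log segments are deleted sooner. This frees disk space and keeps brokers from running out of storage when message volume is high, which helps maintain stable performance under production load.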
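A back-of-the-envelope estimate shows how strongly retention drives disk usage. The sketch below assumes a steady ingest rate and no compression, and it ignores index files and segment-roll granularity. The function name and the sample figures (5 MiB/s, replication factor 3) are hypothetical.

```python
def retained_bytes(ingest_bytes_per_sec, retention_hours, replication_factor):
    """Approximate cluster-wide bytes held on disk at steady state."""
    return ingest_bytes_per_sec * retention_hours * 3600 * replication_factor

GIB = 1024 ** 3
INGEST = 5 * 1024 ** 2  # hypothetical steady ingest of 5 MiB/s

week = retained_bytes(INGEST, 168, 3) / GIB  # 7-day retention
day = retained_bytes(INGEST, 24, 3) / GIB    # 1-day retention
print(f"{week:.1f} GiB vs {day:.1f} GiB")
```

Under these assumptions, cutting retention from seven days to one reduces the retained data by the same factor of seven.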