Consider a Kafka Streams application that reads a stream of user clicks and transforms the data to count clicks per user.
streamsBuilder.stream("clicks")
.map((key, value) -> KeyValue.pair(value.getUserId(), 1))
.groupByKey()
.count()
.toStream()
.foreach((userId, count) -> System.out.println(userId + ": " + count));

If the input stream has these records:
- key=null, value={userId: "alice"}
- key=null, value={userId: "bob"}
- key=null, value={userId: "alice"}
What will be printed?
Think about how the map step changes the key to the userId, enabling grouping by user.
The map step rekeys each record to the userId, so groupByKey followed by count tallies clicks per user. Alice appears twice and Bob once, so the final counts are alice: 2 and bob: 1. Exactly which updates are printed depends on state-store caching: with caching disabled, every count update is emitted downstream, printing alice: 1, bob: 1, alice: 2; with caching enabled, intermediate updates may be collapsed.
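The counting behavior can be sketched in plain Java; this is a simulation of what groupByKey().count() maintains internally (a per-key running count), not the actual Kafka Streams API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ClickCountSketch {
    public static void main(String[] args) {
        // Simulated input: each record's value carries a userId (the keys are null).
        List<String> userIds = List.of("alice", "bob", "alice");

        // The map step rekeys each record to the userId; groupByKey().count()
        // then maintains a running count per key, sketched here with a map.
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String userId : userIds) {
            long updated = counts.merge(userId, 1L, Long::sum);
            // Each count update flows downstream, like toStream().foreach(...).
            System.out.println(userId + ": " + updated);
            // prints: alice: 1, then bob: 1, then alice: 2
        }
    }
}
```

This mirrors the uncached behavior described above: every update is emitted, and the final state holds alice → 2 and bob → 1.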
Why is it important for stream processing systems like Kafka Streams to transform data as it flows through the pipeline?
Think about what real-time data processing aims to achieve.
Stream processing transforms data to enrich, filter, or aggregate it immediately, enabling quick decisions and insights without waiting for batch jobs.
Look at this Kafka Streams code snippet:
streamsBuilder.stream("orders")
.filter((key, value) -> value.getAmount() > 100)
.mapValues(value -> value.getCustomerName())
.to("high-value-customers");

What error will this code cause when run?
Consider how mapValues changes the value's type, and which serdes to() will use when writing to the output topic.
The mapValues call changes the value type from the order object to a String, but to() still uses the application's default value serde, which is configured for the original order type. Unless that default happens to be a String serde, or serdes are passed explicitly (for example via Produced.with(...)), the write fails at runtime with a serialization error such as a ClassCastException.
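The failure mode can be sketched in plain Java: a serializer configured for one type receives a value of a different type after the transformation. This is a simulation of the serde mismatch, not Kafka's actual Serializer API, and the Order record here is illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class SerdeMismatchSketch {
    // Hypothetical order type standing in for the topic's original value type.
    record Order(String customerName, double amount) {}

    public static void main(String[] args) {
        // Stand-in for a default value serde configured for Order objects.
        Function<Object, byte[]> orderSerializer = obj -> {
            Order order = (Order) obj; // throws ClassCastException if given a String
            return (order.customerName() + "," + order.amount())
                    .getBytes(StandardCharsets.UTF_8);
        };

        // After mapValues the value is a String, but the serde still expects Order.
        Object transformedValue = "Alice";
        try {
            orderSerializer.apply(transformedValue);
        } catch (ClassCastException e) {
            System.out.println("Serialization failed: " + e);
        }
    }
}
```

In Kafka Streams itself the fix is to pass matching serdes when writing, e.g. .to("high-value-customers", Produced.with(Serdes.String(), Serdes.String())), assuming both key and value are strings at that point.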
Given a Kafka Streams input stream of strings, which code snippet correctly transforms all values to uppercase?
Remember the correct Java method name for uppercase conversion.
The correct method is toUpperCase(). Option A uses mapValues, which transforms only the value while keeping the key unchanged.
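The same per-record transformation can be tried in plain Java using java.util.stream as a stand-in for Kafka's mapValues, which applies a function to each value while leaving keys untouched:

```java
import java.util.List;
import java.util.stream.Collectors;

public class UppercaseSketch {
    public static void main(String[] args) {
        List<String> values = List.of("click", "view", "purchase");

        // Equivalent of mapValues(String::toUpperCase): transform each value.
        List<String> upper = values.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        System.out.println(upper); // prints [CLICK, VIEW, PURCHASE]
    }
}
```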
Imagine a bank uses Kafka Streams to detect fraudulent transactions in real-time. Which transformation best supports this use case?
Think about how to spot suspicious activity quickly using streaming data.
Filtering high-value transactions and aggregating counts per account in short time windows helps detect unusual patterns immediately, enabling fast fraud alerts.
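A minimal plain-Java sketch of the windowed-count idea, assuming one-minute tumbling windows and an alert threshold of three transactions; the threshold, window size, and class names are illustrative, and a real implementation would use Kafka Streams windowed aggregations:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FraudWindowSketch {
    // Hypothetical transaction: an account id and an epoch-millis timestamp.
    record Txn(String account, long timestampMs) {}

    static final long WINDOW_MS = 60_000; // tumbling window size (illustrative)
    static final int THRESHOLD = 3;       // alert at this count (illustrative)

    public static void main(String[] args) {
        List<Txn> txns = List.of(
                new Txn("acct-1", 1_000), new Txn("acct-1", 5_000),
                new Txn("acct-1", 9_000), new Txn("acct-2", 12_000));

        // Count transactions per (account, window); alert when a count
        // crosses the threshold, mimicking a windowed count + filter.
        Map<String, Integer> counts = new HashMap<>();
        for (Txn t : txns) {
            long windowStart = (t.timestampMs() / WINDOW_MS) * WINDOW_MS;
            String key = t.account() + "@" + windowStart;
            int c = counts.merge(key, 1, Integer::sum);
            if (c >= THRESHOLD) {
                System.out.println("ALERT: " + t.account() + " made " + c
                        + " transactions in window starting " + windowStart);
            }
        }
    }
}
```

Here acct-1 records three transactions inside the same window, so an alert fires for it while acct-2 stays under the threshold.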