Kafka | devops | ~30 mins

GroupBy and aggregation in Kafka - Mini Project: Build & Apply

Kafka Stream Processing: GroupBy and Aggregation
📖 Scenario: You are working with a Kafka stream that receives sales data from a store. Each message contains the product name and the quantity sold. You want to group the sales by product and calculate the total quantity sold for each product.
🎯 Goal: Build a Kafka Streams application that groups sales by product and aggregates the total quantity sold per product.
📋 What You'll Learn
Create a Kafka Streams topology with a source topic named sales.
Group the stream by product key.
Aggregate the total quantity sold per product.
Print the aggregated results to the console.
💡 Why This Matters
🌍 Real World
Retail companies use Kafka Streams to analyze live sales data by product to track inventory and demand in real time.
💼 Career
Understanding Kafka Streams grouping and aggregation is essential for roles in data engineering, real-time analytics, and DevOps managing streaming data pipelines.
1
Create the initial Kafka Streams source stream
Create a KStream named salesStream that reads from the topic "sales" by calling builder.stream("sales") on a StreamsBuilder instance named builder.
Need a hint?

Use builder.stream("sales") to create the stream and assign it to salesStream.

2
Parse the sales data and create a key-value pair
Create a new KStream<String, Integer> named parsedStream by mapping salesStream so that the product becomes the key and the quantity, parsed as an integer, becomes the value. Assume each record value is a comma-separated string of the form "product,quantity". Use map with a lambda of the form (key, value) -> new KeyValue<>(product, quantity), where product and quantity are extracted from value.
Need a hint?

Split the value by comma, parse quantity to Integer, and return new KeyValue with product and quantity.
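The parsing described in this step can be tried out in plain Java before wiring it into the topology. The sketch below is an illustration only: the class name SaleParser and the method parse are hypothetical, and Map.Entry stands in for Kafka's KeyValue so that the code runs without the Kafka Streams dependency. It also assumes the input is well formed; a malformed quantity would throw NumberFormatException.

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical helper, not part of Kafka Streams.
class SaleParser {
    // Splits a "product,quantity" string into a (product, quantity) pair.
    static Map.Entry<String, Integer> parse(String value) {
        String[] parts = value.split(",", 2);
        String product = parts[0].trim();
        int quantity = Integer.parseInt(parts[1].trim());
        return new AbstractMap.SimpleEntry<>(product, quantity);
    }

    public static void main(String[] args) {
        Map.Entry<String, Integer> sale = parse("apple,3");
        System.out.println(sale.getKey() + " -> " + sale.getValue()); // prints "apple -> 3"
    }
}
```

Inside the map lambda, the same two lines (split, then Integer.parseInt) would produce the product and quantity passed to new KeyValue<>(product, quantity).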

3
Group by product and aggregate total quantity
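Create a KTable<String, Integer> named totalByProduct by grouping parsedStream by key with groupByKey() and summing the quantities with reduce. Use (aggValue, newValue) -> aggValue + newValue as the reducer. Because the values are now Integer, the grouping needs a matching value serde; if the default value serde in props is not an Integer serde, pass one explicitly, for example groupByKey(Grouped.with(Serdes.String(), Serdes.Integer())).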
Need a hint?

Use groupByKey() on parsedStream and then reduce to sum the quantities.
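To see what the reducer does record by record, the following plain-Java sketch mimics the running total with an in-memory map. It is only a stand-in for the idea: the class RunningTotals and the method update are hypothetical names, and a real KTable keeps its totals in a fault-tolerant state store rather than a HashMap. As with reduce, the first quantity seen for a product becomes its initial total, and later quantities are combined with the reducer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical in-memory stand-in for the aggregation, not Kafka Streams code.
class RunningTotals {
    // Same shape as the reducer in this step.
    static final BinaryOperator<Integer> REDUCER = (aggValue, newValue) -> aggValue + newValue;

    // Applies one (product, quantity) record and returns the updated total.
    static int update(Map<String, Integer> store, String product, int quantity) {
        // merge stores quantity if the key is new, otherwise applies REDUCER(old, quantity).
        return store.merge(product, quantity, REDUCER);
    }

    public static void main(String[] args) {
        Map<String, Integer> store = new LinkedHashMap<>();
        update(store, "apple", 3);
        update(store, "pear", 2);
        update(store, "apple", 4);
        System.out.println(store); // prints {apple=7, pear=2}
    }
}
```

Each call to update corresponds to one incoming record, and its return value corresponds to the updated total that the table would hold for that product.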

4
Print the aggregated results to the console
Use totalByProduct.toStream().foreach to print each product and its total quantity in the format "Product: <product>, Total Quantity: <quantity>". Then start the Kafka Streams application with KafkaStreams streams = new KafkaStreams(builder.build(), props); streams.start();. Assume props is already defined.
Need a hint?

Use toStream().foreach on totalByProduct to print results, then create and start KafkaStreams with builder.build() and props.
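Two pieces of this step can be sketched without the Kafka dependency: the output format and one possible shape for props, which the exercise assumes is already defined. Everything named here is an assumption made for illustration: the class StreamsSetup, its methods, the application id "sales-totals-app", and the broker address "localhost:9092". The configuration keys are written as plain strings instead of the StreamsConfig constants so that the code runs on its own.

```java
import java.util.Properties;

// Hypothetical setup helper; names and values are placeholders.
class StreamsSetup {
    // One possible props object for the exercise.
    static Properties defaultProps() {
        Properties props = new Properties();
        props.put("application.id", "sales-totals-app");    // hypothetical application id
        props.put("bootstrap.servers", "localhost:9092");   // hypothetical broker address
        // The "sales" topic carries String values, so String serdes are the defaults here.
        // The Integer totals would then need an explicit Integer serde when grouping.
        props.put("default.key.serde", "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("default.value.serde", "org.apache.kafka.common.serialization.Serdes$StringSerde");
        return props;
    }

    // Builds the console line in the format requested by this step.
    static String formatTotal(String product, int total) {
        return "Product: " + product + ", Total Quantity: " + total;
    }

    public static void main(String[] args) {
        System.out.println(formatTotal("apple", 7)); // prints "Product: apple, Total Quantity: 7"
        System.out.println(defaultProps().getProperty("application.id")); // prints "sales-totals-app"
    }
}
```

In the application itself, the body of the foreach lambda would print the same string for each (product, total) update, and the Properties object would be passed as the second argument to new KafkaStreams(builder.build(), props).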