0
0
KafkaConceptBeginner · 4 min read

What is Kafka Streams: Simple Explanation and Example

Kafka Streams is a Java library for building applications that process and analyze data in real-time from Apache Kafka topics. It lets you transform, filter, and aggregate data streams easily within your application without needing separate processing clusters.
⚙️

How It Works

Imagine you have a river of data flowing continuously, like water in a stream. Kafka Streams acts like a watermill on that river, processing the data as it flows by. It reads data from Kafka topics, processes it in real-time, and writes the results back to other topics.

It works inside your application as a library, so you don't need to manage separate servers or clusters. It keeps track of where it left off, so if your app stops and restarts, it continues processing without losing data. This makes it reliable and easy to scale.

💻

Example

This example shows a simple Kafka Streams application that reads text lines from one topic, converts them to uppercase, and writes them to another topic.

java
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

import java.util.Properties;

public class UppercaseStream {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");

        KStream<String, String> uppercased = input.mapValues(value -> value.toUpperCase());

        uppercased.to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
Output
When running, any message sent to 'input-topic' will appear in 'output-topic' with all letters uppercase.
🎯

When to Use

Use Kafka Streams when you want to process data in real-time as it flows through Kafka. It is great for tasks like filtering events, transforming data formats, counting occurrences, or joining streams.

Real-world examples include monitoring user activity on websites, detecting fraud in financial transactions, or aggregating sensor data from IoT devices. It is ideal when you want to embed stream processing directly into your Java applications without managing extra infrastructure.

Key Points

  • Kafka Streams is a lightweight Java library for stream processing.
  • It processes data directly from Kafka topics in real-time.
  • No need for separate processing clusters or servers.
  • Supports transformations, filtering, aggregations, and joins.
  • Automatically handles fault tolerance and state management.

Key Takeaways

Kafka Streams lets you process and transform Kafka data streams inside your Java app.
It is reliable, fault-tolerant, and requires no extra infrastructure.
Use it for real-time data filtering, aggregation, and transformation tasks.
It integrates easily with existing Kafka setups and scales with your app.
Kafka Streams handles state and progress tracking automatically.