
Transform and converter chains in Kafka - Commands & Configuration

Introduction
When data moves through Kafka Connect, it often needs reshaping to fit the target system. Transform and converter chains let you change data format and content step by step, like stations on an assembly line, so the data arrives ready to use.
When you want to change the format of messages from JSON to Avro before sending to a database.
When you need to add or remove fields from data as it moves through Kafka Connect.
When you want to convert data encoding from string to bytes for a specific sink system.
When you want to apply multiple small changes to data in order, like trimming spaces then changing case.
When you want to ensure data is compatible with the target system by applying converters and transforms in sequence.
Config File - connector-with-transform.yaml
connector-with-transform.yaml
name: example-connector
connector.class: org.apache.kafka.connect.file.FileStreamSinkConnector
tasks.max: 1
topics: example-topic
file: /tmp/output.txt
key.converter: org.apache.kafka.connect.storage.StringConverter
value.converter: org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable: false
transforms: keepMessage,messageToKey
transforms.keepMessage.type: org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.keepMessage.whitelist: message
transforms.messageToKey.type: org.apache.kafka.connect.transforms.ValueToKey
transforms.messageToKey.fields: message

This configuration defines a Kafka Connect sink connector that writes records to a file. Despite the .yaml extension, the flat key: value lines are also valid Java properties, which is what standalone mode actually parses.

key.converter and value.converter control how record keys and values are translated between the raw bytes stored in Kafka and Connect's internal data format.

transforms names a chain of single message transforms (SMTs) applied to each record before it is written to the sink.

Each transform is named, configured individually, and runs in the order listed, modifying records step by step.
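To make the chain concrete, here is a hedged trace, written as plain shell echoes rather than anything Connect executes, of how one record's value would change at each stage, assuming the sample input shown:

```shell
# Illustration only: how one record moves through the chain above,
# assuming this JSON value arrives on the topic (hypothetical input).
input='{"message":"hello world","extra":"debug"}'

# Stage 1: JsonConverter (schemas.enable=false) parses the raw bytes into a map.
# Stage 2: ReplaceField$Value with whitelist=message drops every other field.
after_replace='{"message":"hello world"}'
# Stage 3: ValueToKey copies the message field into the record key.
key_out='{"message":"hello world"}'

echo "value in:   $input"
echo "value out:  $after_replace"
echo "key out:    $key_out"
```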

Commands
Start a producer to send messages to the example-topic in Kafka.
Terminal
kafka-console-producer --bootstrap-server localhost:9092 --topic example-topic
Expected Output
No output; the producer shows a > prompt and waits for you to type messages, one record per line
--bootstrap-server - Kafka broker address to connect to (replaces the deprecated --broker-list)
--topic - Specifies the topic to send messages to
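Once the prompt appears, each line you type is sent as one record. A sample schemaless JSON message (hypothetical field names) that suits this connector can also be piped in directly:

```shell
# Send one schemaless JSON record; the extra field is there so the
# connector's ReplaceField transform has something to drop.
kafka-console-producer --bootstrap-server localhost:9092 --topic example-topic <<'EOF'
{"message":"hello world","extra":"debug"}
EOF
```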
Start Kafka Connect in standalone mode with the connector config that applies transform and converter chains.
Terminal
connect-standalone connect-standalone.properties connector-with-transform.yaml
Expected Output
[2024-06-01 12:00:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:00:01,000] INFO Connector example-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
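With the worker running, you can also confirm that the connector and its task are healthy through the Connect REST API (port 8083 is the default; adjust if your worker config differs):

```shell
# Query the standalone worker's REST API for connector status.
curl -s http://localhost:8083/connectors/example-connector/status
```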
Check the output file to see the transformed data after it has passed through the converter and transform chain.
Terminal
cat /tmp/output.txt
Expected Output
{message=hello world}
With schemas disabled, JsonConverter deserializes each value to a map, and FileStreamSink writes that map's toString() form; only the whitelisted message field remains.
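To confirm that the topic itself still holds the untransformed records (sink transforms change what the connector writes, not what is stored in Kafka), read the topic back with a console consumer:

```shell
# The raw records on the topic are unchanged by the sink's transform chain.
kafka-console-consumer --bootstrap-server localhost:9092 \
  --topic example-topic --from-beginning
```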
Key Concept

If you remember nothing else from this pattern, remember: transform and converter chains let you change data format and content step-by-step in Kafka Connect to fit your target system.

Common Mistakes
Not enabling schemas when using JsonConverter causes data format errors.
Without schemas enabled, the converter may not correctly interpret data structure.
Set value.converter.schemas.enable to true when using JsonConverter if your data has schemas.
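When schemas are enabled, JsonConverter expects each message wrapped in a schema/payload envelope. A hedged sketch of what the producer would need to send in that case (hypothetical single-field record):

```shell
# With value.converter.schemas.enable=true, each JSON message must be an
# envelope that carries its own schema alongside the payload.
kafka-console-producer --bootstrap-server localhost:9092 --topic example-topic <<'EOF'
{"schema":{"type":"struct","fields":[{"field":"message","type":"string","optional":false}],"optional":false},"payload":{"message":"hello world"}}
EOF
```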
Applying transforms in the wrong order leads to unexpected data results.
Transforms run in sequence; wrong order can break data or cause errors.
Plan and order transforms carefully to match the desired data changes.
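For example, the order in the transforms list decides what each step can see. In this sketch (hypothetical transform names), filtering fields before ValueToKey works, while the reverse order fails because ValueToKey references a field an earlier transform already removed:

```properties
# Works: keep only "message", then copy it into the key.
transforms: keepMessage,messageToKey
transforms.keepMessage.type: org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.keepMessage.whitelist: message
transforms.messageToKey.type: org.apache.kafka.connect.transforms.ValueToKey
transforms.messageToKey.fields: message

# Fails: drop "message" first, then try to copy the now-missing field.
# transforms: dropMessage,messageToKey
# transforms.dropMessage.type: org.apache.kafka.connect.transforms.ReplaceField$Value
# transforms.dropMessage.blacklist: message
```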
Using incompatible converters for key and value causes connector failures.
Key and value converters must match the data format expected by source and sink.
Choose converters that match your data format and target system requirements.
Summary
Configure key and value converters to handle data format conversion.
Define a chain of transforms to modify data step-by-step before sending to the sink.
Run Kafka Connect with the connector config and verify transformed data output.