
Avro schema definition in Kafka - Commands & Configuration

Introduction
When sending data between systems, both sides need to agree on a format. An Avro schema defines the structure of your data in a simple, language-neutral way, making sure every reader interprets it the same way.
Use an Avro schema:
When you want to send messages in Kafka with a fixed structure so consumers can read them correctly.
When you need to evolve your data format over time without breaking old consumers.
When you want to validate data before sending it, avoiding errors downstream.
When you want to compress data efficiently while keeping the schema available.
When you want to share data formats across different programming languages easily.
Config File - user.avsc
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}

This file defines an Avro schema for a User record.

type: The data type, here a record (like an object).

name: The name of the record.

namespace: A way to group schemas to avoid name clashes.

fields: The list of data fields with their names and types.

The email field can be null or a string, with null as default.
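To see how these field definitions constrain data, here is a minimal hand-rolled sketch that checks a record against the User schema's field types before producing. It is for illustration only; in real code you would use an Avro library's own validation rather than this simplified checker.

```python
# Minimal sketch: check a dict against the User schema's field types.
# Hand-rolled for illustration; a real Avro library does full validation.
import json

SCHEMA = json.loads("""{
  "type": "record", "name": "User", "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}""")

PRIMITIVES = {"int": int, "string": str, "null": type(None)}

def matches(value, avro_type) -> bool:
    if isinstance(avro_type, list):      # a union: any branch may match
        return any(matches(value, t) for t in avro_type)
    if avro_type == "int" and isinstance(value, bool):
        return False                     # bool is a subclass of int in Python
    return isinstance(value, PRIMITIVES[avro_type])

def valid_user(record: dict) -> bool:
    return all(matches(record.get(f["name"]), f["type"])
               for f in SCHEMA["fields"])

print(valid_user({"id": 1, "name": "Alice", "email": None}))    # True
print(valid_user({"id": "1", "name": "Alice", "email": None}))  # False: id is a string
```

The union ["null", "string"] is why email can be either missing or a text value: a union matches if any of its branches matches.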

Commands
This command compiles the Avro schema file into Java classes in the current directory, so you can use the schema in your code.
Terminal
avro-tools compile schema user.avsc .
Expected Output
No output (command runs silently)
This command starts a Kafka producer that uses the Avro schema to send messages to the 'users' topic, ensuring messages follow the schema.
Terminal
kafka-avro-console-producer --broker-list localhost:9092 --topic users --property value.schema.file=user.avsc
Expected Output
>
--broker-list - Specifies Kafka broker addresses
--topic - Specifies the Kafka topic to send messages to
--property value.schema.file - Points to the Avro schema file for message validation
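Under the hood, the producer serializes each JSON line into Avro's compact binary format: fields are written in schema order with no field names, ints use zigzag varints, strings are length-prefixed UTF-8, and a union is a branch index followed by the branch's value. The sketch below hand-rolls this encoding for the User schema to show the layout; real producers use an Avro serializer library (and typically a Schema Registry) instead.

```python
# Minimal sketch of Avro binary encoding for the User schema.
# Hand-rolled for illustration; real producers use an Avro library.

def zigzag_varint(n: int) -> bytes:
    """Encode an int as a zigzag-encoded variable-length integer."""
    z = (n << 1) ^ (n >> 63)          # zigzag maps signed to unsigned
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_string(s: str) -> bytes:
    data = s.encode("utf-8")
    return zigzag_varint(len(data)) + data  # length-prefixed UTF-8

def encode_user(user: dict) -> bytes:
    """Serialize a User record: fields in schema order, no field names."""
    out = zigzag_varint(user["id"])         # "id": int
    out += encode_string(user["name"])      # "name": string
    if user["email"] is None:               # "email": ["null", "string"]
        out += zigzag_varint(0)             # branch 0 = null, no payload
    else:
        out += zigzag_varint(1) + encode_string(user["email"])
    return out

alice = encode_user({"id": 1, "name": "Alice", "email": "alice@example.com"})
bob = encode_user({"id": 2, "name": "Bob", "email": None})
print(alice.hex())  # 020a416c6963650222616c696365406578616d706c652e636f6d
print(bob.hex())    # 0406426f6200
```

Notice that Bob's record with a null email costs a single byte for the union branch, which is part of why Avro payloads are so compact compared to JSON.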
This command starts a Kafka consumer that reads messages from the 'users' topic from the beginning, printing keys and values decoded using the Avro schema.
Terminal
kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic users --from-beginning --property print.key=true --property print.value=true
Expected Output
null {"id":1,"name":"Alice","email":"alice@example.com"}
null {"id":2,"name":"Bob","email":null}
--bootstrap-server - Specifies Kafka broker addresses
--topic - Specifies the Kafka topic to read messages from
--from-beginning - Reads all messages from the start
--property print.key=true - Prints the message key
--property print.value=true - Prints the message value
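The consumer does the reverse of serialization: it walks the schema and reads each field back out of the binary payload. The sketch below hand-decodes an Avro-encoded User record (the hex string is the binary encoding of the first record above) to show what the console consumer automates; a real consumer uses an Avro deserializer plus schema lookups.

```python
# Minimal sketch: hand-decode Avro binary for the User schema.
# For illustration; real consumers use an Avro deserializer library.

def read_varint(buf: bytes, pos: int):
    """Read a zigzag-encoded varint; return (value, new_pos)."""
    shift = z = 0
    while True:
        b = buf[pos]; pos += 1
        z |= (b & 0x7F) << shift
        shift += 7
        if not b & 0x80:          # high bit clear: last byte
            break
    return (z >> 1) ^ -(z & 1), pos  # undo zigzag

def read_string(buf: bytes, pos: int):
    length, pos = read_varint(buf, pos)
    return buf[pos:pos + length].decode("utf-8"), pos + length

def decode_user(buf: bytes) -> dict:
    uid, pos = read_varint(buf, 0)       # "id"
    name, pos = read_string(buf, pos)    # "name"
    branch, pos = read_varint(buf, pos)  # union branch index for "email"
    email = None if branch == 0 else read_string(buf, pos)[0]
    return {"id": uid, "name": name, "email": email}

raw = bytes.fromhex("020a416c6963650222616c696365406578616d706c652e636f6d")
print(decode_user(raw))
# → {'id': 1, 'name': 'Alice', 'email': 'alice@example.com'}
```

Because the payload carries no field names, the reader must know the schema to decode it; this is exactly why producers and consumers have to share the schema (or a registry reference to it).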
Key Concept

If you remember nothing else from this pattern, remember: Avro schema defines a clear, shared structure for your data so producers and consumers agree on the format.

Common Mistakes
Not specifying the schema file when producing messages
Messages won't be validated or serialized correctly, causing errors or unreadable data.
Always use the --property value.schema.file flag with the correct schema file when producing Avro messages.
Changing the schema fields without updating consumers
Consumers may fail to read messages or get wrong data if the schema changes unexpectedly.
Use schema evolution rules and update all consumers to handle new schema versions.
Using incompatible types in the schema fields
Data won't serialize or deserialize properly, causing runtime errors.
Use valid Avro types and unions for optional fields, like ["null", "string"] for nullable strings.
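As an example of the evolution advice above, a backward-compatible change adds a new field with a default, so consumers using the new schema can still read records written with the old one (the country field below is illustrative):

```json
{
  "type": "record",
  "name": "User",
  "namespace": "com.example",
  "fields": [
    {"name": "id", "type": "int"},
    {"name": "name", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null},
    {"name": "country", "type": ["null", "string"], "default": null}
  ]
}
```

Removing a field without a default, or changing a field's type, is not backward compatible and will break existing consumers.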
Summary
Define your data structure clearly in an Avro schema file with fields and types.
Use avro-tools to compile the schema, and the Kafka Avro console producer to send messages with schema validation.
Use Kafka Avro console consumer to read and decode messages using the same schema.