KafkaConceptBeginner · 3 min read

What is Avro in Kafka: Simple Explanation and Usage

Avro is a schema-based data serialization format commonly used with Kafka to encode messages in a compact, fast binary form. It ensures that producers and consumers agree on the data structure, making data exchange reliable and efficient.
⚙️ How It Works

Avro works like a shared language between the sender and receiver of data in Kafka. Imagine sending a letter where both people agree on the format of the letter beforehand. Avro uses a schema that defines the structure of the data, like a form everyone fills out the same way.

When a Kafka producer sends a message, Avro converts the data into a compact binary format using this schema. The Kafka consumer then uses the same schema to decode the message back into readable data. This process avoids confusion and errors, even if the data structure changes over time.

💻 Example

This example shows how to serialize and deserialize a simple Avro record in Java — the same encoding a Kafka producer and consumer would apply to message payloads.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class AvroExample {
    public static void main(String[] args) throws IOException {
        String userSchema = "{\"type\": \"record\", \"name\": \"User\", \"fields\": ["
                + "{\"name\": \"name\", \"type\": \"string\"},"
                + "{\"name\": \"age\", \"type\": \"int\"}]}";

        Schema schema = new Schema.Parser().parse(userSchema);

        GenericRecord user = new GenericData.Record(schema);
        user.put("name", "Alice");
        user.put("age", 30);

        // Serialize
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DatumWriter<GenericRecord> writer = new GenericDatumWriter<>(schema);
        Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(user, encoder);
        encoder.flush();
        out.close();

        byte[] serializedBytes = out.toByteArray();

        // Deserialize
        DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema);
        Decoder decoder = DecoderFactory.get().binaryDecoder(serializedBytes, null);
        GenericRecord result = reader.read(null, decoder);

        System.out.println("Deserialized user: " + result);
    }
}
Output
Deserialized user: {"name": "Alice", "age": 30}
🎯 When to Use

Use Avro in Kafka when you want to send data between systems that need to agree on the data format. It is great for:

  • Ensuring data consistency with schemas
  • Reducing message size for faster transmission
  • Handling evolving data structures without breaking consumers
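The last point — evolving data structures without breaking consumers — relies on Avro's schema resolution: a consumer can read data written with an older schema by supplying both the writer's schema and its own newer reader schema, with defaults filling in any new fields. A minimal sketch, reusing the User record from the example above and adding a hypothetical "email" field:

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;

public class SchemaEvolutionExample {
    public static void main(String[] args) throws Exception {
        // Writer schema: the original User record
        Schema writerSchema = new Schema.Parser().parse(
                "{\"type\": \"record\", \"name\": \"User\", \"fields\": ["
                + "{\"name\": \"name\", \"type\": \"string\"},"
                + "{\"name\": \"age\", \"type\": \"int\"}]}");

        // Reader schema: a newer version that adds "email" with a default,
        // so it can still read data written with the old schema
        Schema readerSchema = new Schema.Parser().parse(
                "{\"type\": \"record\", \"name\": \"User\", \"fields\": ["
                + "{\"name\": \"name\", \"type\": \"string\"},"
                + "{\"name\": \"age\", \"type\": \"int\"},"
                + "{\"name\": \"email\", \"type\": \"string\", \"default\": \"unknown\"}]}");

        // Serialize a record using the old (writer) schema
        GenericRecord user = new GenericData.Record(writerSchema);
        user.put("name", "Alice");
        user.put("age", 30);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(writerSchema).write(user, encoder);
        encoder.flush();

        // Deserialize with both schemas; Avro resolves the missing
        // "email" field from the reader schema's default
        Decoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord result =
                new GenericDatumReader<GenericRecord>(writerSchema, readerSchema)
                        .read(null, decoder);
        System.out.println("Resolved record: " + result);
        System.out.println("email = " + result.get("email"));
    }
}
```

Because the new field has a default, old messages remain readable — this is what makes a schema change backward compatible.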

For example, in a company where multiple services share user data, Avro helps keep the data format clear and consistent. It also works well with Kafka Schema Registry, which manages schemas centrally.
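In practice, the Schema Registry integration is wired in through producer configuration rather than manual serialization. A minimal sketch of such a configuration — assuming Confluent's kafka-avro-serializer dependency and typical local addresses (localhost:9092 broker, localhost:8081 registry):

```java
import java.util.Properties;

public class AvroProducerConfig {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");      // assumed broker address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Confluent's Avro serializer registers and fetches schemas
        // from the registry automatically
        props.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // assumed registry URL

        // Print the configuration for inspection
        props.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With this configuration, a KafkaProducer sends GenericRecord values directly, and the serializer handles the Avro encoding and schema registration behind the scenes.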

Key Points

  • Avro uses schemas to define data structure clearly.
  • It serializes data into a compact binary format for efficiency.
  • Kafka producers and consumers use Avro to avoid data format mismatches.
  • Schema Registry helps manage and evolve schemas safely.

Key Takeaways

Avro provides a schema-based, compact way to serialize Kafka messages.
It ensures producers and consumers agree on data structure to avoid errors.
Use Avro when you need efficient, consistent, and evolving data formats in Kafka.
Schema Registry complements Avro by managing schemas centrally.