Kafka · Concept · Beginner · 3 min read

What is Schema Registry in Kafka: Explanation and Example

A Schema Registry in Kafka is a service that stores and manages data schemas for Kafka messages. It ensures that producers and consumers agree on the data format, preventing errors and enabling smooth data evolution.

How It Works

Think of the Schema Registry as a librarian who keeps track of the exact format of every book (data message) in a library (Kafka). When a producer wants to send a message, it checks with the librarian to see if the format (schema) is known and valid. If it is, the message is sent with a reference to that format.

On the other side, consumers ask the librarian for the format before reading the message. This way, both sides understand the data the same way, even if the format changes over time. The registry also helps manage different versions of schemas, so updates don’t break existing data flows.
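The flow above can be sketched as a toy in-memory registry mirroring the librarian analogy. The `ToySchemaRegistry` class below is a hypothetical simplification, not the real Confluent Schema Registry API: producers register a schema once and attach only its id to each message, and consumers resolve that id back to the full schema before decoding.

```python
# Toy in-memory "schema registry" illustrating the librarian analogy.
# This is a simplified sketch, not the real Confluent Schema Registry API.

class ToySchemaRegistry:
    def __init__(self):
        self._schemas = {}   # id -> (subject, schema)
        self._next_id = 1

    def register(self, subject, schema):
        """Catalog a schema and return its id; reuse the id if already known."""
        for sid, (subj, sch) in self._schemas.items():
            if subj == subject and sch == schema:
                return sid
        sid = self._next_id
        self._schemas[sid] = (subject, schema)
        self._next_id += 1
        return sid

    def get(self, schema_id):
        """Look up a schema by id (what a consumer does before decoding)."""
        return self._schemas[schema_id][1]

registry = ToySchemaRegistry()

# Producer side: register the schema once, then send only its id with each message
user_schema = '{"type": "record", "name": "User", "fields": []}'
schema_id = registry.register("users-value", user_schema)
message = {"schema_id": schema_id, "payload": b"serialized-record"}

# Consumer side: resolve the id back to the full schema before decoding
schema = registry.get(message["schema_id"])
```

Because messages carry only a small schema id rather than the full schema text, every message stays compact while both sides still agree on the exact format.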


Example

This example shows how to register a schema and produce a message using Avro with Kafka and Schema Registry. It uses the `AvroProducer` API from the confluent-kafka Python client (deprecated in newer releases in favor of `AvroSerializer`, but simpler to read for a first look).

```python
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer

# Define the Avro schema as a JSON string
value_schema_str = '''{
  "namespace": "example.avro",
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"}
  ]
}'''

value_schema = avro.loads(value_schema_str)

# Kafka broker and Schema Registry configuration
config = {
    'bootstrap.servers': 'localhost:9092',
    'schema.registry.url': 'http://localhost:8081'
}

# Create an AvroProducer bound to the schema
producer = AvroProducer(config, default_value_schema=value_schema)

# Produce a message that matches the schema; the schema is
# registered with the Schema Registry automatically on first use
producer.produce(topic='users', value={'name': 'Alice', 'age': 30})
producer.flush()
```

Output

The message is produced to topic 'users', and the User schema is registered in the Schema Registry (by default under the subject 'users-value').
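Under the hood, each serialized message carries a small prefix instead of the full schema: Confluent's wire format prepends a magic byte (0) and the schema id as a 4-byte big-endian integer. A minimal sketch of framing and unframing:

```python
import struct

def frame(schema_id, payload):
    # Confluent wire format: magic byte 0, then 4-byte big-endian schema id
    return struct.pack('>bI', 0, schema_id) + payload

def unframe(data):
    magic, schema_id = struct.unpack('>bI', data[:5])
    assert magic == 0, "unknown wire format"
    return schema_id, data[5:]

framed = frame(42, b'avro-bytes')
sid, payload = unframe(framed)
print(sid, payload)   # 42 b'avro-bytes'
```

This is why the registry matters: a consumer that receives only the 5-byte prefix plus Avro bytes must ask the registry for schema id 42 before it can decode the payload.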

When to Use

Use Schema Registry when you want to ensure that all Kafka messages follow a consistent format. It is especially useful when multiple applications produce or consume data, and you want to avoid errors caused by mismatched data formats.

It also helps when you need to evolve your data format over time without breaking existing consumers. For example, adding new fields or changing data types can be managed safely with schema versions.
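As a sketch of one such rule: under backward compatibility, a new schema version may add fields only if they carry defaults, so consumers on the new schema can still read records written with the old one. The `backward_compatible` helper below is a hypothetical simplification of the registry's real compatibility checker:

```python
# Toy backward-compatibility rule: any field added in the new schema
# must have a default, or old records cannot be read with it.
def backward_compatible(old_fields, new_fields):
    old_names = {f["name"] for f in old_fields}
    for f in new_fields:
        if f["name"] not in old_names and "default" not in f:
            return False   # new required field -> breaks old data
    return True

v1 = [{"name": "name", "type": "string"},
      {"name": "age", "type": "int"}]

# Adding "email" with a default is safe; old records just get the default.
v2_ok = v1 + [{"name": "email", "type": "string", "default": ""}]

# Adding a required "id" field would break consumers reading old records.
v2_bad = v1 + [{"name": "id", "type": "long"}]

print(backward_compatible(v1, v2_ok))   # True
print(backward_compatible(v1, v2_bad))  # False
```

The real registry enforces rules like this (in several configurable modes) at registration time, rejecting a new schema version before it can break anyone downstream.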

Key Points

  • Centralized schema management: One place to store and validate schemas.
  • Compatibility checks: Prevents incompatible schema changes.
  • Supports schema evolution: Allows safe updates to data formats.
  • Works with Avro, JSON Schema, Protobuf: Supports multiple schema types.

Key Takeaways

  • Schema Registry ensures producers and consumers agree on data format in Kafka.
  • It stores schemas centrally and manages schema versions to avoid errors.
  • Use it to safely evolve data formats without breaking existing consumers.
  • It supports multiple schema types like Avro, JSON Schema, and Protobuf.