Kafka · DevOps · ~10 mins

Serialization (String, JSON, Avro) in Kafka - Step-by-Step Execution

Process Flow - Serialization (String, JSON, Avro)
Start: Data in memory
Choose serialization format
Serialize data to bytes
Send bytes to Kafka topic
Consumer reads bytes
Deserialize bytes back to data
Use data in application
End
Data is converted from memory format to bytes using a chosen serialization method, sent through Kafka, then converted back to usable data by the consumer.
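The flow above can be walked through in plain Python without a broker; the in-memory list below is a stand-in for the Kafka topic, not a real client call:

```python
import json

# Steps 1-2: data in memory, choose a serialization format (JSON here)
record = {"key": "value"}

# Step 3: serialize the data to bytes
payload = json.dumps(record).encode("utf-8")

# Steps 4-5: producer sends bytes, consumer reads them back
topic = []                 # stand-in for a Kafka topic
topic.append(payload)      # producer side
received = topic.pop(0)    # consumer side

# Step 6: deserialize bytes back to data
restored = json.loads(received.decode("utf-8"))

# Step 7: use the data in the application
print(restored["key"])
```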
Execution Sample
Python
# assumes a configured producer (e.g. kafka-python's KafkaProducer) and an Avro serializer
producer.send(topic, value=message.encode('utf-8'))  # String: UTF-8 bytes
producer.send(topic, value=json.dumps(message_dict).encode('utf-8'))  # JSON
producer.send(topic, value=avro_serializer(message_dict))  # Avro: schema-based binary
Shows how to serialize data as String, JSON, and Avro before sending to Kafka.
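A self-contained version of the three serializations that runs without a broker. The `avro_serializer` helper here is a hypothetical, hand-rolled sketch of Avro's binary string encoding (zigzag-varint length followed by UTF-8 bytes) for a one-string-field record; a real deployment would use a schema-aware library such as fastavro or a Schema Registry serializer:

```python
import json

message = "Hello Kafka"
message_dict = {"key": "value"}

# String: encode text straight to UTF-8 bytes
string_bytes = message.encode("utf-8")

# JSON: dict -> JSON text -> UTF-8 bytes
json_bytes = json.dumps(message_dict).encode("utf-8")

def avro_serializer(record):
    """Minimal sketch of Avro's binary encoding for a record with one
    string field: zigzag-varint length, then the UTF-8 bytes.
    (A real serializer also needs the Avro schema.)"""
    value = record["key"].encode("utf-8")
    zigzag = len(value) << 1          # zigzag-encode the (non-negative) length
    out = bytearray()
    while True:                        # varint-encode, 7 bits per byte
        byte = zigzag & 0x7F
        zigzag >>= 7
        out.append(byte | 0x80 if zigzag else byte)
        if not zigzag:
            break
    return bytes(out) + value

avro_bytes = avro_serializer(message_dict)
print(string_bytes, json_bytes, avro_bytes)
```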
Process Table

| Step | Data Type | Action | Serialized Output (bytes) | Kafka Topic Send |
|------|-----------|--------|---------------------------|------------------|
| 1 | String | Encode string to UTF-8 bytes | b'Hello Kafka' | Sent to topic |
| 2 | JSON | Convert dict to JSON string, then UTF-8 bytes | b'{"key": "value"}' | Sent to topic |
| 3 | Avro | Serialize dict using Avro schema to bytes | b'\x0avalue' | Sent to topic |
| 4 | Consumer | Read bytes from topic | b'\x0avalue' | Received |
| 5 | Avro | Deserialize bytes back to dict | {"key": "value"} | Used in app |
| 6 | End | All data processed | - | - |
💡 All data serialized, sent, received, and deserialized successfully.
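The consumer side of the table (steps 4-5) simply reverses the encoding; a minimal sketch for the String and JSON cases:

```python
import json

# bytes as they would arrive from the topic
raw_string = b'Hello Kafka'
raw_json = b'{"key": "value"}'

# String: decode UTF-8 bytes back to text
text = raw_string.decode("utf-8")

# JSON: decode bytes to text, then parse back into a dict
data = json.loads(raw_json.decode("utf-8"))

print(text, data["key"])
```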
Status Tracker

| Variable | Start | After String Serialization | After JSON Serialization | After Avro Serialization | After Deserialization |
|----------|-------|----------------------------|--------------------------|--------------------------|-----------------------|
| message | "Hello Kafka" | b'Hello Kafka' | - | - | - |
| message_dict | {"key": "value"} | - | b'{"key": "value"}' | b'\x0avalue' | {"key": "value"} |
Key Moments - 3 Insights
Why do we encode strings to bytes before sending to Kafka?
Kafka sends data as bytes, so strings must be encoded (see execution_table step 1) to convert text into bytes.
How is JSON different from a plain string in serialization?
JSON converts structured data (like dict) to a string format before encoding to bytes (execution_table step 2), unlike plain strings which are encoded directly.
What makes Avro serialization special compared to String and JSON?
Avro uses a schema to serialize data into a compact binary format (step 3), which is more efficient and supports schema evolution.
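The size difference behind the third insight can be checked directly: JSON repeats field names in every message, while Avro relies on the reader knowing the schema and sends only the values. Using the table's example record, with the Avro bytes hand-computed per Avro's binary encoding (zigzag-varint length plus UTF-8 bytes, assuming a one-string-field record schema):

```python
import json

record = {"key": "value"}

json_bytes = json.dumps(record).encode("utf-8")  # field name travels with the data
avro_bytes = b'\x0avalue'                        # varint length 5 + UTF-8 "value"

print(len(json_bytes), len(avro_bytes))
```

Even on this tiny record, the Avro payload is less than half the size of the JSON one; the gap widens with more fields and longer field names.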
Visual Quiz - 3 Questions
Test your understanding
Looking at the execution_table, what is the serialized output for JSON at step 2?
A. b'{"key": "value"}'
B. b'Hello Kafka'
C. b'\x0avalue'
D. {"key": "value"}
💡 Hint
Check the 'Serialized Output (bytes)' column at step 2 in execution_table.
At which step does the consumer deserialize Avro bytes back to a dictionary?
A. Step 3
B. Step 5
C. Step 1
D. Step 2
💡 Hint
Look for 'Deserialize bytes back to dict' in the 'Action' column in execution_table.
If we skip encoding strings to bytes, what will happen when sending to Kafka?
A. The data is sent successfully as a string
B. The data is automatically converted to JSON
C. Kafka throws an error because it expects bytes
D. Avro serialization happens instead
💡 Hint
Recall why encoding to bytes is needed from key_moments and execution_table step 1.
Concept Snapshot
Serialization converts data into bytes for Kafka.
String: encode text to UTF-8 bytes.
JSON: convert dict to JSON string, then encode.
Avro: use schema to serialize compact binary.
Consumer deserializes bytes back to data.
Serialization ensures data travels safely and efficiently.
Full Transcript
Serialization is the process of converting data into bytes so Kafka can send it. We can serialize simple strings by encoding them to UTF-8 bytes. For structured data like dictionaries, JSON serialization converts the dict to a JSON string, then encodes it to bytes. Avro serialization uses a schema to convert data into a compact binary format, which is efficient and supports schema changes. The producer sends these bytes to a Kafka topic. The consumer reads the bytes and deserializes them back to the original data format to use in the application. This process ensures data is safely and efficiently transmitted through Kafka.
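The transcript condenses into a runnable round trip. In this sketch an in-memory list stands in for the Kafka topic, and the `produce`/`consume` helpers are hypothetical stand-ins for real client calls; the bytes-only check mirrors what a real Kafka client enforces when no value serializer is configured:

```python
import json

topic = []  # stand-in for a Kafka topic: a queue of byte payloads

def produce(topic, value):
    """Producers may only hand Kafka bytes, so enforce that here."""
    if not isinstance(value, bytes):
        raise TypeError("Kafka values must be bytes; serialize first")
    topic.append(value)

def consume(topic):
    return topic.pop(0)

# Producer side: serialize, then send
original = {"key": "value"}
produce(topic, json.dumps(original).encode("utf-8"))

# Consumer side: read bytes, then deserialize
restored = json.loads(consume(topic).decode("utf-8"))

print(restored == original)
```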