What Is a Source Connector in Kafka: Simple Explanation and Example
A source connector in Kafka is a Kafka Connect component that imports data from an external system into Kafka topics. It acts like a bridge that continuously reads data from databases, files, or other sources and sends it into Kafka for processing.
How It Works
Think of a source connector as a smart pipeline that takes data from outside Kafka and pushes it inside. Imagine a water pump that draws water from a river (external system) and fills a tank (Kafka topic). The source connector keeps checking the river for new water and keeps the tank filled.
Technically, it connects to systems like databases, message queues, or files, reads new or changed data, and writes that data into Kafka topics in real time or near real time. This way, Kafka can process or distribute fresh data without manual copying.
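The idea of "reading new or changed data" can be sketched in a few lines of Python. This is only a toy model, not how Kafka Connect is implemented: the `IncrementingPoller` class, the in-memory `customers` list, and the `demo-customers` topic name are all hypothetical stand-ins. It assumes the source has an ever-increasing `id` column and remembers the highest id it has already copied, so each poll forwards only rows it has not seen before.

```python
from dataclasses import dataclass, field


@dataclass
class IncrementingPoller:
    """Toy stand-in for a source connector that tracks the highest id seen."""

    topic: str
    last_id: int = 0  # offset: highest id already copied to the topic
    published: list = field(default_factory=list)  # stands in for a Kafka topic

    def poll(self, table_rows):
        """Copy rows with id > last_id to the topic and advance the offset."""
        new_rows = sorted(
            (row for row in table_rows if row["id"] > self.last_id),
            key=lambda row: row["id"],
        )
        for row in new_rows:
            self.published.append((self.topic, row))
            self.last_id = row["id"]
        return new_rows


# Hypothetical table contents; a real connector would query the external system.
customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
poller = IncrementingPoller(topic="demo-customers")

first_batch = poller.poll(customers)  # both rows are new
customers.append({"id": 3, "name": "Linus"})
second_batch = poller.poll(customers)  # only the row with id 3 is new
```

Because the poller keeps its own offset, repeated polls do not resend rows that are already in the topic. Note that this simple scheme only notices inserted rows, not updates or deletes.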
Example
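This example shows a simple source connector configuration that uses the Confluent JDBC source connector to read the customers table from a MySQL database and send it to a Kafka topic. With these settings, the connector polls the table every 5 seconds, uses the id column to detect newly inserted rows, and writes them to the topic mysql-customers (the topic prefix followed by the table name).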
{
  "name": "mysql-source-connector",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://localhost:3306/mydb",
    "connection.user": "user",
    "connection.password": "password",
    "table.whitelist": "customers",
    "mode": "incrementing",
    "incrementing.column.name": "id",
    "topic.prefix": "mysql-",
    "poll.interval.ms": "5000"
  }
}
When to Use
Use a source connector when you want to stream data from external systems into Kafka automatically. For example:
- Syncing database changes into Kafka for real-time analytics.
- Importing log files or event data from external apps.
- Integrating legacy systems with modern Kafka-based pipelines.
This helps keep data fresh and reduces manual data movement.
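To start streaming automatically, a connector configuration such as the one in the Example section is typically submitted to a Kafka Connect worker through its REST interface with a POST to /connectors. The sketch below prepares that request using only the Python standard library. The helper names (`build_connector_payload`, `expected_topics`, `build_request`) are hypothetical, and the worker address `http://localhost:8083` is an assumption based on the usual default REST port; adjust it for your environment.

```python
import json
import urllib.request


def build_connector_payload():
    """Return the connector definition from the Example section as a dict."""
    return {
        "name": "mysql-source-connector",
        "config": {
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "connection.url": "jdbc:mysql://localhost:3306/mydb",
            "connection.user": "user",
            "connection.password": "password",
            "table.whitelist": "customers",
            "mode": "incrementing",
            "incrementing.column.name": "id",
            "topic.prefix": "mysql-",
            "poll.interval.ms": "5000",
        },
    }


def expected_topics(payload):
    """Topic names the JDBC source connector derives: topic.prefix + table name."""
    config = payload["config"]
    tables = [t.strip() for t in config["table.whitelist"].split(",")]
    return [config["topic.prefix"] + table for table in tables]


def build_request(payload, connect_url="http://localhost:8083"):
    """Prepare a POST to the Kafka Connect REST endpoint /connectors.

    The default connect_url is an assumption, not something the article states.
    """
    return urllib.request.Request(
        url=connect_url.rstrip("/") + "/connectors",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_connector_payload()
request = build_request(payload)
print(request.get_method(), request.full_url)
print(expected_topics(payload))

# Sending the request needs a running Connect worker, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping request construction separate from sending makes it possible to inspect the payload and the derived topic names before anything touches a live cluster.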
Key Points
- A source connector imports data from outside systems into Kafka topics.
- It runs continuously, often by polling the source at a configured interval, to keep data updated.
- Common sources include databases, files, and message queues.
- It simplifies data integration and streaming pipelines.