Common connectors (JDBC, S3, Elasticsearch) in Kafka - Commands & Configuration

Introduction
Kafka Connect connectors move data between Kafka and external systems such as databases, object storage, and search engines. Instead of writing and maintaining custom transfer code, you configure a ready-made connector: a JDBC source to read from relational databases, an S3 sink to write to Amazon S3, or an Elasticsearch sink to index into Elasticsearch. Typical situations where a connector fits:
When you want to stream data from a database into Kafka without writing custom code.
When you need to save Kafka topic data as files in Amazon S3 for backup or analytics.
When you want to index Kafka data into Elasticsearch for fast search and analysis.
When you want to keep your data systems synchronized automatically.
When you want to simplify data integration between Kafka and other platforms.
Config File - jdbc-source-connector.properties
name=jdbc-source-connector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=myuser
connection.password=mypassword
topic.prefix=jdbc-
poll.interval.ms=10000
mode=incrementing
incrementing.column.name=id

This configuration file sets up a Kafka JDBC source connector.

name: unique name for the connector.

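connector.class: the Java class that implements the connector, here Confluent's JDBC source connector. The connector plugin must be installed where the Connect worker can load it (typically a directory listed in the worker's plugin.path).

connection.url, connection.user, connection.password: database connection details.

topic.prefix: prefix prepended to each table name to form the topic name, so a table named mytable is written to the topic jdbc-mytable.

poll.interval.ms: how often, in milliseconds, the connector polls the database for new data (here, every 10 seconds).

mode and incrementing.column.name: how new rows are detected. In incrementing mode the connector remembers the largest id value it has read and fetches only rows with a higher value, so it picks up inserts but not updates to existing rows.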
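
Because incrementing mode only looks at the id column, changes to rows that were already read go unnoticed. The following is a minimal sketch of a variant that also picks up updates, assuming the table carries a last-modified timestamp column. The column name updated_at and the restriction to a single table named mytable are hypothetical and not part of the original configuration.

```properties
# Sketch only: the same connector, switched to timestamp+incrementing mode.
name=jdbc-source-connector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=myuser
connection.password=mypassword
topic.prefix=jdbc-
poll.interval.ms=10000
# Assumption: limit the connector to one table that has both columns below.
table.whitelist=mytable
# Detect new rows by id and modified rows by timestamp.
mode=timestamp+incrementing
incrementing.column.name=id
# Assumption: updated_at is a hypothetical column that is set on every write.
timestamp.column.name=updated_at
```

With this mode, a row is re-read whenever its timestamp advances, so downstream consumers should expect to see the same id more than once.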

Commands
Starts the Kafka Connect standalone worker with the JDBC source connector configuration to begin streaming data from the database into Kafka topics.
Terminal
connect-standalone /etc/kafka/connect-standalone.properties jdbc-source-connector.properties
Expected Output
[2024-06-01 12:00:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:00:01,000] INFO Starting connector jdbc-source-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:00:02,000] INFO Connector jdbc-source-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
Lists all Kafka topics to verify that topics with the prefix 'jdbc-' have been created by the JDBC source connector.
Terminal
kafka-topics --bootstrap-server localhost:9092 --list
Expected Output
jdbc-mytable
jdbc-anothertable
--bootstrap-server - Specifies the host and port of a Kafka broker to connect to.
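Starts a Kafka Connect standalone worker with an S3 sink connector configuration to write Kafka topic data to Amazon S3. Stop the worker from the previous step first, or pass several connector files to a single connect-standalone command: two workers started from the same worker properties on one host would compete for the same REST port and offset storage file.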
Terminal
connect-standalone /etc/kafka/connect-standalone.properties s3-sink-connector.properties
Expected Output
[2024-06-01 12:05:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:05:01,000] INFO Starting connector s3-sink-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:05:02,000] INFO Connector s3-sink-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
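The contents of s3-sink-connector.properties are not shown in this lesson. A minimal sketch of what such a file might contain follows, assuming the Confluent S3 sink connector plugin is installed. The bucket name, region, source topic, and flush size are placeholder assumptions, not values from the original.

```properties
# Sketch only: a possible s3-sink-connector.properties.
name=s3-sink-connector
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
# Assumption: read the topic produced by the JDBC source connector.
topics=jdbc-mytable
# Assumption: placeholder bucket and region; replace with your own.
s3.bucket.name=my-kafka-backup-bucket
s3.region=us-east-1
storage.class=io.confluent.connect.s3.storage.S3Storage
# Write records as JSON.
format.class=io.confluent.connect.s3.format.json.JsonFormat
# Number of records collected before a file is committed to S3.
flush.size=1000
```

No AWS credentials appear in this sketch. By default the connector is expected to find them through the standard AWS credential provider chain (environment variables, a credentials file, or an instance role).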
Starts the Kafka Connect standalone worker with an Elasticsearch sink connector configuration to index Kafka topic data into Elasticsearch for search and analytics.
Terminal
connect-standalone /etc/kafka/connect-standalone.properties elasticsearch-sink-connector.properties
Expected Output
[2024-06-01 12:10:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:10:01,000] INFO Starting connector elasticsearch-sink-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:10:02,000] INFO Connector elasticsearch-sink-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
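The contents of elasticsearch-sink-connector.properties are likewise not shown. A minimal sketch of a possible file follows, assuming the Confluent Elasticsearch sink connector plugin is installed and an unsecured Elasticsearch node is listening locally. The URL and the source topic are placeholder assumptions.

```properties
# Sketch only: a possible elasticsearch-sink-connector.properties.
name=elasticsearch-sink-connector
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
# Assumption: index the topic produced by the JDBC source connector.
topics=jdbc-mytable
# Assumption: a local, unsecured Elasticsearch node on the default HTTP port.
connection.url=http://localhost:9200
# Ignore the record key and derive document IDs from topic, partition, and offset.
key.ignore=true
```

Setting key.ignore=true is a convenient starting point when records have no meaningful key. To have later records overwrite earlier documents for the same entity, the records would need a key and key.ignore would be set to false.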
Key Concept

If you remember nothing else, remember: Kafka connectors automate moving data between Kafka and other systems without writing custom code.

Common Mistakes
Setting the wrong database connection URL or credentials in the JDBC connector config.
The connector cannot connect to the database, so no data is streamed.
Test the database URL, username, and password with a database client before starting the connector.

Starting Kafka Connect without the connector configuration.
The worker runs but never loads the connector, so no data moves.
In standalone mode, pass the connector properties file to connect-standalone after the worker properties file. In distributed mode, connect-distributed takes only the worker properties; submit the connector configuration through the Kafka Connect REST API.

Looking for the wrong topic names or prefixes when verifying topics.
You may conclude the connector failed when the topics exist under a different name.
Check the topic.prefix setting and look for topics named prefix plus table name, such as jdbc-mytable.
Summary
Kafka connectors simplify data integration by automating data flow between Kafka and systems like databases, S3, and Elasticsearch.
You configure connectors with properties files specifying connection details and behavior.
You start connectors using Kafka Connect commands and verify topics or data destinations to confirm operation.