Common connectors (JDBC, S3, Elasticsearch) in Kafka - Commands & Configuration

Introduction

Kafka Connect connectors move data between Kafka and other systems such as databases, object storage, and search engines. Instead of writing and maintaining custom transfer code, you configure a ready-made connector: the JDBC connector for relational databases, the S3 connector for Amazon S3, and the Elasticsearch connector for Elasticsearch.

Typical situations where a connector fits:

Streaming data from a database into Kafka without writing custom code.

Saving Kafka topic data as files in Amazon S3 for backup or analytics.

Indexing Kafka data into Elasticsearch for fast search and analysis.

Keeping data systems synchronized automatically.

Simplifying data integration between Kafka and other platforms.

Config File - jdbc-source-connector.properties

name=jdbc-source-connector
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=myuser
connection.password=mypassword
topic.prefix=jdbc-
poll.interval.ms=10000
mode=incrementing
incrementing.column.name=id

This configuration file sets up a Kafka JDBC source connector.

name: unique name for the connector.

connector.class: specifies the JDBC source connector class.

connection.url, connection.user, connection.password: database connection details.

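topic.prefix: prefix prepended to each table name to form the topic name, so the table mytable is written to the topic jdbc-mytable.

poll.interval.ms: how often, in milliseconds, to query the database for new data (10000 means every 10 seconds).

mode and incrementing.column.name: how to detect new rows. In incrementing mode the connector remembers the highest value seen in a strictly increasing column (here id) and fetches only rows with a larger value. This mode picks up inserts only; updates and deletes to existing rows are not detected.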
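To make the topic.prefix rule concrete, the sketch below reads the prefix from a trimmed copy of the config above and prints the topic name expected for each table. The table names mytable and anothertable match the sample topic list shown in the Commands section. The temporary directory and the prop and expected_topic helpers are illustrative assumptions, not part of Kafka Connect, and the naming rule applies when the connector copies whole tables rather than a custom query.

```shell
#!/bin/sh
# Sketch: derive expected topic names from topic.prefix.
# The temp directory and helper names are illustrative, not part of Kafka Connect.
workdir=$(mktemp -d)
cat > "$workdir/jdbc-source-connector.properties" <<'EOF'
name=jdbc-source-connector
topic.prefix=jdbc-
mode=incrementing
incrementing.column.name=id
EOF

# Read the value of a key from a simple key=value properties file.
prop() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# In table mode, each topic is named <topic.prefix><table name>.
expected_topic() {
  printf '%s%s\n' "$(prop "$1" topic.prefix)" "$2"
}

expected_topic "$workdir/jdbc-source-connector.properties" mytable
expected_topic "$workdir/jdbc-source-connector.properties" anothertable
```

With the prefix jdbc- this prints jdbc-mytable and jdbc-anothertable, which are the names to look for when listing topics.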

Commands

Starts the Kafka Connect standalone worker with the JDBC source connector configuration to begin streaming data from the database into Kafka topics.

Terminal

connect-standalone /etc/kafka/connect-standalone.properties jdbc-source-connector.properties

Expected Output

[2024-06-01 12:00:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:00:01,000] INFO Starting connector jdbc-source-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:00:02,000] INFO Connector jdbc-source-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
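Besides the log lines, a running worker can be asked for a connector's state over the Kafka Connect REST interface. The sketch below assumes the worker listens on the default REST port 8083 on localhost and that curl is installed. status_url, connector_status, and the CONNECT_HOST variable are hypothetical names chosen for this example.

```shell
#!/bin/sh
# Build the status URL for a connector (Connect REST API: GET /connectors/<name>/status).
status_url() {
  printf 'http://%s/connectors/%s/status\n' "$1" "$2"
}

# Fetch the connector status as JSON; fails if the worker is unreachable.
# CONNECT_HOST is an assumed override for the worker address.
connector_status() {
  curl -sf "$(status_url "${CONNECT_HOST:-localhost:8083}" "$1")"
}

# Show the URL that would be queried for the connector started above.
status_url localhost:8083 jdbc-source-connector
```

Once the worker is up, calling connector_status jdbc-source-connector should return a JSON document describing the connector and its tasks; a state of RUNNING for both suggests the connector is healthy.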

Lists all Kafka topics to verify that topics with the prefix 'jdbc-' have been created by the JDBC source connector.

Terminal

kafka-topics --bootstrap-server localhost:9092 --list

Expected Output

jdbc-mytable
jdbc-anothertable

--bootstrap-server - Specifies the host and port of a Kafka broker to connect to.
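On a cluster with many topics, the list is easier to check when filtered by the connector's prefix. A minimal sketch follows: topics_with_prefix is a hypothetical helper that reads a topic list on standard input, and the sample list (including the orders topic) stands in for real kafka-topics output.

```shell
#!/bin/sh
# Keep only topic names that start with the given prefix (literal match, no regex).
topics_with_prefix() {
  while IFS= read -r topic; do
    case "$topic" in
      "$1"*) printf '%s\n' "$topic" ;;
    esac
  done
}

# Sample input standing in for: kafka-topics --bootstrap-server localhost:9092 --list
printf '%s\n' __consumer_offsets jdbc-mytable jdbc-anothertable orders |
  topics_with_prefix jdbc-
```

Against a live broker the same helper can be fed directly: kafka-topics --bootstrap-server localhost:9092 --list | topics_with_prefix jdbc-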

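Starts the Kafka Connect standalone worker with an S3 sink connector configuration to save Kafka topic data into Amazon S3 storage. Stop the previous worker first, or pass several connector files to a single connect-standalone invocation: a second worker started with the same worker properties would collide with the first on the REST port and the offset storage file.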

Terminal

connect-standalone /etc/kafka/connect-standalone.properties s3-sink-connector.properties

Expected Output

[2024-06-01 12:05:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:05:01,000] INFO Starting connector s3-sink-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:05:02,000] INFO Connector s3-sink-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
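The command above refers to s3-sink-connector.properties, which this page does not show. The following is a minimal sketch of what such a file might contain, written with a shell heredoc. The bucket name my-kafka-bucket, the region us-east-1, the topic jdbc-mytable, the JSON output format, and flush.size=1000 are assumed placeholder values, and AWS credentials are assumed to come from the environment or an instance profile.

```shell
#!/bin/sh
# Sketch of an S3 sink config; bucket, region, topic, format, and flush.size are placeholders.
cat > s3-sink-connector.properties <<'EOF'
name=s3-sink-connector
connector.class=io.confluent.connect.s3.S3SinkConnector
tasks.max=1
topics=jdbc-mytable
s3.bucket.name=my-kafka-bucket
s3.region=us-east-1
storage.class=io.confluent.connect.s3.storage.S3Storage
format.class=io.confluent.connect.s3.format.json.JsonFormat
flush.size=1000
EOF

# Print the file so it can be reviewed before starting the worker.
cat s3-sink-connector.properties
```

Here flush.size controls how many records are collected before an object is written to S3, so a small value yields many small files and a large value delays when data becomes visible in the bucket.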

Starts the Kafka Connect standalone worker with an Elasticsearch sink connector configuration to index Kafka topic data into Elasticsearch for search and analytics.

Terminal

connect-standalone /etc/kafka/connect-standalone.properties elasticsearch-sink-connector.properties

Expected Output

[2024-06-01 12:10:00,000] INFO Kafka Connect started (org.apache.kafka.connect.cli.ConnectStandalone)
[2024-06-01 12:10:01,000] INFO Starting connector elasticsearch-sink-connector (org.apache.kafka.connect.runtime.WorkerConnector)
[2024-06-01 12:10:02,000] INFO Connector elasticsearch-sink-connector started (org.apache.kafka.connect.runtime.WorkerConnector)
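The file elasticsearch-sink-connector.properties is also not shown on this page. A minimal sketch of a plausible version is given below, again written with a heredoc. The Elasticsearch address http://localhost:9200 and the topic jdbc-mytable are assumptions, and some older connector versions may additionally need a type.name setting.

```shell
#!/bin/sh
# Sketch of an Elasticsearch sink config; the URL and topic are placeholders.
cat > elasticsearch-sink-connector.properties <<'EOF'
name=elasticsearch-sink-connector
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=jdbc-mytable
connection.url=http://localhost:9200
key.ignore=true
schema.ignore=true
EOF

# Print the file so it can be reviewed before starting the worker.
cat elasticsearch-sink-connector.properties
```

With key.ignore=true the connector builds document IDs from the topic, partition, and offset instead of the record key, and schema.ignore=true leaves the index mapping to Elasticsearch rather than deriving it from the record schema.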

Key Concept

If you remember nothing else, remember: Kafka connectors automate moving data between Kafka and other systems without writing custom code.

Common Mistakes

Not setting the correct database connection URL or credentials in the JDBC connector config.

The connector cannot connect to the database, so no data is streamed.

Double-check and test the database URL, username, and password before starting the connector.

Forgetting to start Kafka Connect with the connector configuration file.

Kafka Connect won't load the connector, so no data movement happens.

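In standalone mode, always pass the connector config file to connect-standalone after the worker properties file. In distributed mode, connect-distributed takes only the worker properties, and connectors are created afterwards through the Kafka Connect REST API.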

Using incorrect topic names or prefixes when verifying topics.

You may think the connector failed when topics exist but under a different name.

Check the topic.prefix setting and list topics carefully.
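Several of these mistakes can be caught before the worker starts. The sketch below checks that a properties file contains a non-empty value for each key it is asked about. check_required_keys is a hypothetical helper, and the list of keys reflects the JDBC example on this page rather than the connector's own validation rules.

```shell
#!/bin/sh
# Report any required key that is missing or empty in a key=value properties file.
# Returns non-zero when at least one key is missing.
check_required_keys() {
  file=$1; shift
  missing=0
  for key in "$@"; do
    if ! grep -q "^$key=." "$file"; then
      echo "missing or empty: $key"
      missing=1
    fi
  done
  return "$missing"
}

# Demo on a deliberately incomplete config (connection.password is absent).
demo=$(mktemp)
cat > "$demo" <<'EOF'
name=jdbc-source-connector
connection.url=jdbc:postgresql://localhost:5432/mydb
connection.user=myuser
topic.prefix=jdbc-
EOF

check_required_keys "$demo" name connection.url connection.user connection.password topic.prefix ||
  echo "fix the config before starting Kafka Connect"
```

A check like this only confirms that settings are present. Whether the URL and credentials actually work still has to be tested against the database itself.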

Summary

Kafka connectors simplify data integration by automating data flow between Kafka and systems like databases, S3, and Elasticsearch.

You configure connectors with properties files specifying connection details and behavior.

You start connectors using Kafka Connect commands and verify topics or data destinations to confirm operation.