Why connectors integrate external systems in Kafka - Performance Analysis
When connectors link Kafka to outside systems, we want to know how the work grows as data grows.
How does the time to move data change when more data or systems are involved?
Analyze the time complexity of the following code snippet.
connector.poll() {
    records = externalSystem.fetchRecords()
    for (record in records) {
        kafkaTopic.send(record)
    }
}
This code fetches records from an external system and sends each one to a Kafka topic.
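The pseudocode above can be turned into a small runnable simulation. This is only a sketch: `FakeExternalSystem`, `FakeTopic`, and `poll` are hypothetical stand-ins invented for illustration, not part of Kafka or Kafka Connect, and no real network calls are made.

```python
class FakeExternalSystem:
    """Hypothetical stand-in for the external system; returns n dummy records."""

    def __init__(self, n):
        self.n = n

    def fetch_records(self):
        return [f"record-{i}" for i in range(self.n)]


class FakeTopic:
    """Hypothetical stand-in for a Kafka topic; it only remembers what was sent."""

    def __init__(self):
        self.sent = []

    def send(self, record):
        self.sent.append(record)


def poll(external_system, topic):
    """One poll cycle: fetch once, then send each record individually."""
    records = external_system.fetch_records()
    for record in records:
        topic.send(record)  # executed once per fetched record
    return len(records)
```

Calling `poll(FakeExternalSystem(100), FakeTopic())` performs one fetch and 100 sends, which mirrors the loop analyzed in this section.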
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Looping over each record fetched from the external system.
- How many times: Once for every record received in each poll call.
As the number of records fetched grows, the time to process them grows too.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 sends to Kafka |
| 100 | 100 sends to Kafka |
| 1000 | 1000 sends to Kafka |
Pattern observation: The work grows directly with the number of records fetched.
Time Complexity: O(n)
This means processing time grows linearly with n, the number of records fetched in a poll: doubling the records roughly doubles the work.
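The linear bound can also be written as a rough cost model. The symbols here are assumptions introduced for illustration: $T_{\text{fetch}}(n)$ is the cost of the single fetch call, assumed to be at most linear in $n$ with constants $a$ and $b$, and $c_{\text{send}}$ is an assumed constant cost per send.

$$
T(n) = T_{\text{fetch}}(n) + n \cdot c_{\text{send}} \le a + b\,n + n \cdot c_{\text{send}} = a + (b + c_{\text{send}})\,n = O(n)
$$

Under these assumptions the loop contributes the $n \cdot c_{\text{send}}$ term, which matches the one-send-per-record pattern in the table.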
[X] Wrong: "The connector time stays the same no matter how many records come in."
[OK] Correct: Each record needs to be handled, so more records mean more work and more time.
Understanding how connectors scale with data helps you explain real-world system behavior clearly and confidently.
What if the connector batches records before sending? How would the time complexity change?
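One way to explore this question is to count operations in a small simulation. The sketch below assumes a hypothetical `batch_size` parameter and assumes that one send call can carry a whole batch; `poll_batched` is an illustrative helper, not a Kafka API, and it does not model the real producer.

```python
def poll_batched(records, batch_size):
    """Group records into batches and count the work done.

    Returns (send_calls, records_handled).
    """
    send_calls = 0
    records_handled = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        send_calls += 1                 # assumed: one send per batch
        records_handled += len(batch)   # every record is still touched once
    return send_calls, records_handled
```

With 1000 records and a batch size of 100, the function reports 10 send calls but still 1000 records handled. In this model the number of sends drops to roughly n / batch_size, while the per-record work stays proportional to n, so batching lowers the constant factor rather than the O(n) growth.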