Wire encryption for data in transit in Hadoop - Time & Space Complexity
When Hadoop encrypts data in transit between nodes, it adds extra work to every transfer. We want to see how that extra work affects total transfer time as the data grows.
How does encrypting data during transfer change the work Hadoop does as data size grows?
Analyze the time complexity of the following code snippet.
```
// Pseudocode for Hadoop data transfer with encryption
for each data block in file:
    read block from disk
    encrypt block
    send encrypted block over network
    wait for acknowledgement
```
This code reads each block of data, encrypts it, sends it, and waits for confirmation before moving on.
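The loop above can be sketched in runnable Python. This is a toy model: the XOR "cipher" and the no-op network send are placeholders standing in for Hadoop's real wire encryption and Data Transfer Protocol, and the tiny block size is just for illustration.

```python
# Toy sketch of per-block encrypt-and-send. The XOR cipher and the
# no-op "network" are placeholders, not Hadoop's actual wire encryption.
BLOCK_SIZE = 4  # bytes per block here; real HDFS blocks default to 128 MB

def encrypt(block: bytes, key: int = 0x5A) -> bytes:
    # Toy XOR cipher standing in for a real cipher such as AES.
    return bytes(b ^ key for b in block)

def send_and_wait(block: bytes) -> bool:
    # Placeholder for sending over the network and waiting for an ack.
    return True

def transfer(data: bytes) -> int:
    """Encrypt and send each block in turn; return blocks processed."""
    sent = 0
    for start in range(0, len(data), BLOCK_SIZE):
        block = data[start:start + BLOCK_SIZE]   # read block
        ciphertext = encrypt(block)              # encrypt block
        assert send_and_wait(ciphertext)         # send, wait for ack
        sent += 1
    return sent

print(transfer(b"0123456789ab"))  # 12 bytes / 4 per block -> 3
```

Each pass through the loop does a constant amount of work per block, which is what the analysis below counts.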
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Loop over each data block to encrypt and send.
- How many times: Once per data block, so the iteration count equals the number of blocks (the input size n).
As the file size grows, the number of blocks grows, so the encryption and sending steps repeat more times.
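The iteration count can be computed directly from the file size. A minimal sketch, assuming fixed-size blocks:

```python
def num_blocks(file_size: int, block_size: int) -> int:
    # Ceiling division: a partial final block still costs one full
    # encrypt-and-send iteration.
    return -(-file_size // block_size)

print(num_blocks(1000, 128))  # -> 8 blocks, so 8 loop iterations
```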
| Input Size (n blocks) | Approx. Operations |
|---|---|
| 10 | 10 encryptions and sends |
| 100 | 100 encryptions and sends |
| 1000 | 1000 encryptions and sends |
Pattern observation: The work grows directly with the number of blocks; doubling blocks doubles work.
Time Complexity: O(n)
This means the time to encrypt and send data grows in a straight line with the amount of data.
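The doubling pattern from the table can be confirmed with a quick operation count (counting steps only, no real encryption):

```python
def operations(n_blocks: int) -> int:
    # Each block costs one encryption plus one send.
    ops = 0
    for _ in range(n_blocks):
        ops += 2  # encrypt, send
    return ops

for n in (10, 100, 1000):
    print(n, operations(n))
# Doubling the blocks doubles the total operations: linear, O(n).
```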
[X] Wrong: "Encryption adds a fixed time, so it doesn't affect how time grows with data size."
[OK] Correct: Encryption happens for every block, so as data grows, encryption time grows too, not just a fixed cost.
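The misconception can be made concrete with a simple cost model. The constants below are illustrative placeholders, not measured Hadoop numbers: only one-time setup (such as a key exchange) is a fixed cost, while encryption is paid on every block.

```python
def total_time(n_blocks: int, t_encrypt: float, t_send: float,
               fixed_setup: float) -> float:
    # Encryption and sending are paid per block, so they scale with n;
    # only the setup term is a fixed cost independent of data size.
    return fixed_setup + n_blocks * (t_encrypt + t_send)

t_small = total_time(10, t_encrypt=2.0, t_send=5.0, fixed_setup=100.0)
t_large = total_time(1000, t_encrypt=2.0, t_send=5.0, fixed_setup=100.0)
print(t_small, t_large)  # 170.0 vs 7100.0: the per-block term dominates
```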
Understanding how encryption affects data transfer time helps you explain real-world trade-offs between security and speed. This skill shows that you can reason about practical system design.
"What if encryption was done once for the whole file instead of per block? How would the time complexity change?"
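One way to reason about this question: even one-pass, whole-file encryption still has to touch every byte, so the asymptotic class stays O(n); what changes is the per-block overhead, a constant-factor difference. A sketch under that assumption, with illustrative cost constants:

```python
def per_block_cost(n_blocks: int, t_enc_block: float,
                   t_overhead: float) -> float:
    # Per-block scheme: encryption AND per-block overhead (acks,
    # cipher setup) repeat n times.
    return n_blocks * (t_enc_block + t_overhead)

def whole_file_cost(n_blocks: int, t_enc_block: float,
                    t_overhead: float) -> float:
    # Whole-file scheme: encryption still scales with data size
    # (every byte is processed), but the overhead is paid once.
    return n_blocks * t_enc_block + t_overhead

# Both grow linearly in n; only the constant factor differs.
print(per_block_cost(1000, 2.0, 1.0))   # 3000.0
print(whole_file_cost(1000, 2.0, 1.0))  # 2001.0
```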