Processing CSV files in Bash Scripting - Time & Space Complexity
When working with CSV files in bash, it is important to understand how processing time grows as the file gets bigger: specifically, how the script's running time changes as the number of lines in the CSV increases.
Analyze the time complexity of the following code snippet.
```bash
#!/bin/bash
filename="data.csv"

# Read the CSV line by line, splitting each line on commas
while IFS=, read -r col1 col2 col3
do
    echo "Column 1: $col1, Column 2: $col2"
done < "$filename"
```
This script reads a CSV file line by line and prints the first two columns of each line.
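For example, a quick run against a two-row sample file (the file contents here are made up purely for illustration) shows the loop producing one line of output per row:

```shell
# Create a small sample CSV (hypothetical data for illustration)
cat > data.csv <<'EOF'
alice,30,NYC
bob,25,LA
EOF

# Same loop as the script above, reading the sample file
while IFS=, read -r col1 col2 col3
do
    echo "Column 1: $col1, Column 2: $col2"
done < data.csv
# Prints:
#   Column 1: alice, Column 2: 30
#   Column 1: bob, Column 2: 25
```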
Identify any loops, recursion, or array traversals that repeat work.
- Primary operation: The while loop reads and processes each line of the CSV file.
- How many times: Once for every line in the file (n times, where n is the number of lines).
As the number of lines in the CSV file grows, the script processes each line one by one.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 line reads and prints |
| 100 | About 100 line reads and prints |
| 1000 | About 1000 line reads and prints |
Pattern observation: The number of operations grows directly with the number of lines. Double the lines, double the work.
Time Complexity: O(n)
This means the time to process the CSV grows linearly with the number of lines in the file.
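A quick empirical check can confirm this. The sketch below (file names and generated contents are assumptions for illustration) counts the loop iterations for files of different sizes and shows they match n exactly:

```shell
# For each test size, generate a CSV with n identical rows,
# run the read loop, and count how many times the body executes.
for n in 10 100 1000
do
    seq 1 "$n" | sed 's/.*/a,b,c/' > "test_$n.csv"   # n rows of "a,b,c"
    count=0
    while IFS=, read -r col1 col2 col3
    do
        count=$((count + 1))
    done < "test_$n.csv"
    echo "lines=$n iterations=$count"   # iterations always equals n: O(n)
    rm -f "test_$n.csv"
done
# Prints:
#   lines=10 iterations=10
#   lines=100 iterations=100
#   lines=1000 iterations=1000
```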
[X] Wrong: "Reading a CSV file is always fast and does not depend on file size."
[OK] Correct: The script reads each line one by one, so bigger files take more time to process.
Understanding how your script scales with input size lets you write efficient automation that handles real-world data volumes with confidence.
"What if we added a nested loop inside the while loop to process each column separately? How would the time complexity change?"