Process substitution (<() and >()) in Bash Scripting - Time & Space Complexity
We want to understand how the time cost changes when using process substitution in bash scripts.
Specifically, how the script's execution time grows as input size changes when using <() and >().
Analyze the time complexity of the following bash snippet using process substitution.
diff <(sort file1.txt) <(sort file2.txt)
This code compares two files by sorting each and then running diff on the sorted outputs using process substitution.
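To make the mechanics concrete, here is a minimal sketch of what `diff` receives. It assumes two small sample files created in a scratch directory (the contents and the `workdir` variable are hypothetical), and compares the process-substitution form with an equivalent temp-file form.

```shell
#!/usr/bin/env bash
# Sketch: what diff actually sees when process substitution is used.
# The sample data and the names workdir, fd_path, ps_output, tmp_output are illustrative.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

printf '%s\n' banana apple cherry > "$workdir/file1.txt"
printf '%s\n' cherry apple date   > "$workdir/file2.txt"

# Bash replaces <(...) with a path the command can open for reading,
# typically something under /dev/fd (or a named pipe on some systems).
fd_path=$(echo <(true))
echo "diff is handed a path like: $fd_path"

# Process-substitution form from the text. diff exits with status 1 when the
# inputs differ; '|| true' keeps that from being treated as a failure.
ps_output=$(diff <(sort "$workdir/file1.txt") <(sort "$workdir/file2.txt") || true)

# Equivalent form using explicit temporary files.
sort "$workdir/file1.txt" > "$workdir/sorted1.txt"
sort "$workdir/file2.txt" > "$workdir/sorted2.txt"
tmp_output=$(diff "$workdir/sorted1.txt" "$workdir/sorted2.txt" || true)

echo "$ps_output"
```

Both forms should produce the same diff. The process-substitution form simply avoids writing the sorted intermediates to disk; each `sort` still does all of its work.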
Look for loops or repeated commands inside the process substitution.
- Primary operation: Sorting each file with `sort`, which reads and processes all lines.
- How many times: Each file is processed once by `sort`, then `diff` compares the sorted outputs line by line.
As the size of each file grows, the sorting and diffing take more time.
| Input Size (n lines per file) | Approx. Operations |
|---|---|
| 10 | Sort ~33 comparisons per file (10 × log₂ 10) + diff ~10 lines |
| 100 | Sort ~660 comparisons per file (100 × log₂ 100) + diff ~100 lines |
| 1000 | Sort ~10,000 comparisons per file (1000 × log₂ 1000) + diff ~1000 lines |
Pattern observation: Sorting cost grows faster than the line count (roughly n log n), while diff grows roughly linearly when the two sorted outputs are mostly similar.
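One way to check this pattern empirically is to time the snippet for growing inputs. The sketch below is an illustration under assumptions: `make_input` is a hypothetical helper that writes pseudo-random integers, the second file is the first plus one extra line so that the diff stays small, and the absolute timings will vary by machine.

```shell
#!/usr/bin/env bash
# Sketch: time diff <(sort ...) <(sort ...) for growing n.
# make_input, workdir, and elapsed are illustrative names, not part of the original snippet.

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# make_input N FILE: write N pseudo-random integers, one per line, in unsorted order.
make_input() {
  awk -v n="$1" 'BEGIN { srand(42); for (i = 0; i < n; i++) print int(rand() * 1000000) }' > "$2"
}

TIMEFORMAT='%R'   # make the `time` keyword print only elapsed wall-clock seconds

for n in 1000 10000 100000; do
  make_input "$n" "$workdir/file1.txt"
  # file2 is file1 plus one extra line, so the sorted outputs differ only slightly.
  { cat "$workdir/file1.txt"; echo 123456789; } > "$workdir/file2.txt"

  # `time` reports on stderr, so the group's stderr is captured.
  # diff exits 1 because the inputs differ; '|| true' ignores that status.
  elapsed=$( { time diff <(sort "$workdir/file1.txt") <(sort "$workdir/file2.txt") > /dev/null; } 2>&1 || true )
  echo "n=$n elapsed=${elapsed}s"
done
```

For small n the measurement is dominated by process startup, so the growth only becomes visible at the larger sizes.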
Time Complexity: O(n log n)
This means the script's time mainly depends on sorting each file, which grows a bit faster than just reading lines.
[X] Wrong: "Process substitution runs instantly and does not add to time complexity."
[OK] Correct: Process substitution runs the commands inside it fully, so their time cost counts just like normal commands.
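A quick way to see this is to put a deliberately slow command inside `<()`. The following sketch uses `sleep 2` as a stand-in for an expensive command (an assumption for illustration) and measures elapsed time with Bash's `SECONDS` variable, which has whole-second resolution.

```shell
#!/usr/bin/env bash
# Sketch: a slow command inside <(...) still costs its full running time.
# one_elapsed and two_elapsed are illustrative names.

start=$SECONDS
cat <(sleep 2; echo "slow producer finished") > /dev/null
one_elapsed=$(( SECONDS - start ))

# Bash starts both substitutions before running cat, so the two producers
# run at the same time rather than one after the other.
start=$SECONDS
cat <(sleep 2; echo first) <(sleep 2; echo second) > /dev/null
two_elapsed=$(( SECONDS - start ))

echo "one substitution:  about ${one_elapsed}s"
echo "two substitutions: about ${two_elapsed}s"
```

The first measurement shows that the reader waits for the producer to finish. The second typically comes out close to the first rather than double, because the producers overlap; the total work done is unchanged, only the wall-clock time benefits.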
Understanding how process substitution affects script time helps you write efficient bash scripts and explain your choices clearly in interviews.
What if we replaced sort with a command that reads the file once without sorting? How would the time complexity change?