zip and unzip in Linux CLI - Time & Space Complexity
When we use the zip and unzip commands, we want to know how the time they take grows as the files get bigger or more numerous.
The core question: how does the work increase when we add more files or larger files?
Analyze the time complexity of the following code snippet.
zip archive.zip file1.txt file2.txt file3.txt
unzip archive.zip -d output_folder
This code compresses three files into one archive, then extracts them to a folder.
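To see this in practice, here is a minimal sketch that creates three sample files and times both steps. It assumes the standard Info-ZIP `zip` and `unzip` tools and GNU coreutils are available; the file names and sizes are just examples.

```bash
# Create three small text files (about 1 MB of text each) to work with.
for i in 1 2 3; do
    head -c 1M /dev/urandom | base64 > "file$i.txt"
done

# Compress the three files into one archive and time the step.
time zip archive.zip file1.txt file2.txt file3.txt

# Extract them into output_folder and time that step too (-o overwrites without prompting).
time unzip -o archive.zip -d output_folder
```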
Identify the loops, recursion, or array traversals that repeat work.
- Primary operation: Reading and compressing each file during zip, and reading and decompressing each file during unzip.
- How many times: Once per file, processing all bytes inside each file.
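A quick way to see the quantity that drives the cost is to add up the bytes zip must read. This is only an illustration; `du -cb` is the GNU coreutils form, and the file names match the example above.

```bash
# Report per-file and combined sizes in bytes (GNU du: -b for bytes, -c for a total).
du -cb file1.txt file2.txt file3.txt

# The same idea as an explicit loop: each file is visited once and every byte is counted.
total=0
for f in file1.txt file2.txt file3.txt; do
    total=$(( total + $(wc -c < "$f") ))
done
echo "Total bytes to read and compress: $total"
```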
As you add more files or bigger files, the time to zip or unzip grows roughly in proportion to the total size of all files combined.
| Input Size (total MB) | Approx. Operations |
|---|---|
| 10 | Processes about 10 MB of data |
| 100 | Processes about 100 MB of data |
| 1000 | Processes about 1000 MB (1 GB) of data |
Pattern observation: Doubling the total file size roughly doubles the work done.
Time Complexity: O(n)
This means the time grows linearly with the total size of the files being zipped or unzipped.
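To check the linear pattern empirically, you can double the input size a few times and time each run. The sizes, file names, and use of /dev/urandom below are arbitrary choices; absolute times depend on your CPU and disk, but each run should take roughly twice as long as the previous one.

```bash
for mb in 100 200 400; do
    # One input file of $mb megabytes of random data.
    head -c "${mb}M" /dev/urandom > "data_${mb}.bin"
    echo "Zipping ${mb} MB:"
    # -q suppresses per-file output so the timing is easier to read.
    time zip -q "test_${mb}.zip" "data_${mb}.bin"
done
```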
[X] Wrong: "Zipping many small files is always faster than one big file of the same size."
[OK] Correct: The total data size matters most, not how it is split; the per-file overhead of many small files is minor compared to the cost of reading and compressing all of the bytes (see the comparison sketch below).
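A small experiment makes the point: compare one large file against many small files with the same combined size. The sizes and names here are hypothetical, but the two timings should come out close, with only a modest extra cost for the per-file bookkeeping.

```bash
# 100 files of 1 MB each versus a single 100 MB file (same total data).
mkdir -p many_small
for i in $(seq 1 100); do
    head -c 1M /dev/urandom > "many_small/part_$i.bin"
done
head -c 100M /dev/urandom > one_big.bin

# -r zips the directory recursively; -q keeps the output quiet.
time zip -qr small.zip many_small
time zip -q  big.zip one_big.bin
```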
Understanding how compression tools scale helps you reason about performance in real tasks, showing you can think about how scripts behave with bigger data.
"What if we used a compression method that only compressed parts of files? How would the time complexity change?"