wc (word, line, character count) in Linux CLI - Time & Space Complexity
We want to understand how the time taken by the wc command changes as the size of the input file grows.
Specifically, how does counting words, lines, or characters scale with bigger files?
Analyze the time complexity of this command:
wc filename.txt
This command reads the file once, from start to end, counting lines, words, and characters as it goes.
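For example, with a small sample file (the file name and contents here are made up for illustration):

```shell
# Create a small sample file, then count it with wc.
printf 'hello world\nfoo bar baz\n' > sample.txt

# Output format: <lines> <words> <characters> <filename>
wc sample.txt
# prints the counts 2, 5, and 24: 2 lines, 5 words, 24 characters
```

A single pass over those 24 characters is enough to produce all three counts at once.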
- Primary operation: Reading each character in the file once.
- How many times: Once per character in the file.
The time taken grows directly with the size of the file because every character must be checked.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 characters | About 10 reads |
| 100 characters | About 100 reads |
| 1000 characters | About 1000 reads |
Pattern observation: Doubling the file size roughly doubles the work done.
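You can observe this pattern on your own machine with a quick experiment (file names and sizes below are arbitrary, and absolute timings will differ between systems):

```shell
# Generate files of increasing size, then time wc on each.
# Doubling the input size should roughly double the elapsed time.
for n in 1000000 2000000 4000000; do
  head -c "$n" /dev/urandom > "data_$n"
  printf 'file size: %s bytes -> ' "$n"
  time wc "data_$n" > /dev/null
done
```

For very small files the timings are dominated by startup overhead, so the linear trend is easiest to see with inputs of a megabyte or more.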
Time Complexity: O(n), where n is the number of characters in the file.
This means the time to count lines, words, and characters grows in direct proportion to the file size.
[X] Wrong: "Counting words or lines is faster than counting characters because there are fewer words or lines."
[OK] Correct: Even when asked only for words or lines, the command still reads every character to locate word and line boundaries, so it must process the whole file either way.
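A small demonstration of this point (the file name is illustrative): even when only the line count is requested, wc still streams past every byte to find each newline.

```shell
# A 14-byte file containing two lines.
printf 'one two\nthree\n' > demo.txt

# Reports only the line count, but still scans all 14 bytes to find the newlines.
wc -l demo.txt
# prints: 2 demo.txt
```

Restricting the output does not restrict the input: the whole file is read regardless of which counts are printed.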
Understanding how simple commands like wc scale helps you think about processing large data efficiently in real tasks.
What if we only want to count lines using wc -l? How would the time complexity change?