0
0
Linux CLIscripting~5 mins

sort and uniq in Linux CLI - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: sort and uniq
O(n log n)
Understanding Time Complexity

When using sort and uniq commands together, it is important to understand how the time taken grows as the input size increases.

We want to know how the execution time changes when the input file gets bigger.

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

sort input.txt | uniq

This code sorts the lines of a file and then removes duplicate lines that are next to each other.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Sorting all lines in the input file.
  • How many times: The sorting process compares and rearranges lines multiple times depending on input size.
  • Secondary operation: uniq scans the sorted lines once to remove duplicates.
  • How many times: One pass through all lines.
How Execution Grows With Input

As the number of lines grows, sorting takes more time because it compares many pairs of lines. Removing duplicates is faster since it just checks neighbors once.

Input Size (n)Approx. Operations
10About 30 to 50 comparisons for sorting, plus 10 checks for duplicates
100About 700 to 1000 comparisons for sorting, plus 100 checks for duplicates
1000About 10,000 to 15,000 comparisons for sorting, plus 1000 checks for duplicates

Pattern observation: Sorting grows faster than linearly, while duplicate removal grows linearly.

Final Time Complexity

Time Complexity: O(n log n)

This means the time to sort and remove duplicates grows a bit faster than just the number of lines, mainly because sorting takes more work as the file gets bigger.

Common Mistake

[X] Wrong: "The whole command runs in linear time because uniq just scans once."

[OK] Correct: Sorting is the main cost and it takes more than linear time, so the total time is dominated by sorting, not just scanning.

Interview Connect

Understanding how sorting affects performance helps you explain real-world scripts and commands clearly. This skill shows you can think about how tools behave as data grows.

Self-Check

"What if we used sort -u instead of sort | uniq? How would the time complexity change?"