
sort and uniq in pipelines in Bash Scripting - Step-by-Step Execution

Concept Flow - sort and uniq in pipelines
Input lines -> sort command -> uniq command -> Output unique sorted lines
The pipeline takes input lines, sorts them alphabetically, then filters out duplicates to output unique sorted lines.
Execution Sample
Bash Scripting
printf "apple\nbanana\napple\ncherry\nbanana\n" | sort | uniq
This command sorts the list of fruits and removes duplicates, showing unique sorted fruits.
Execution Table
| Step | Command | Input | Output | Explanation |
|------|---------|-------|--------|-------------|
| 1 | printf | none | apple banana apple cherry banana | Prints the list of fruits with duplicates |
| 2 | sort | apple banana apple cherry banana | apple apple banana banana cherry | Sorts lines alphabetically, duplicates stay |
| 3 | uniq | apple apple banana banana cherry | apple banana cherry | Removes adjacent duplicate lines |
| 4 | end | N/A | apple banana cherry | Pipeline ends with unique sorted output |
💡 The pipeline ends once uniq has read every sorted line; it outputs one copy of each distinct line.
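The steps in the execution table can be reproduced by running each stage separately and capturing its output. This is a minimal sketch: the variable names `input`, `sorted`, and `unique` are made up for illustration and are not part of the original command.

```bash
#!/usr/bin/env bash
# Hypothetical variable names; the data is the fruit list from the sample.

# Step 1: printf produces the raw lines, duplicates included.
input=$(printf "apple\nbanana\napple\ncherry\nbanana\n")

# Step 2: sort orders the lines, which places identical lines next to
# each other but keeps every copy.
sorted=$(printf '%s\n' "$input" | sort)

# Step 3: uniq collapses each run of adjacent identical lines into one.
unique=$(printf '%s\n' "$sorted" | uniq)

printf 'after sort:\n%s\n\n' "$sorted"
printf 'after uniq:\n%s\n' "$unique"
```

Capturing each stage in a variable is only for inspection; in normal use the three commands are chained directly with pipes.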
Variable Tracker
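| Variable | Start | After printf | After sort | After uniq | Final |
|----------|-------|--------------|------------|------------|-------|
| lines | empty | apple banana apple cherry banana | apple apple banana banana cherry | apple banana cherry | apple banana cherry |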
Key Moments - 3 Insights
Why does uniq only remove duplicates after sort, not before?
uniq only removes duplicates that are next to each other. Sorting groups duplicates together so uniq can remove them, as shown in steps 2 and 3 of the execution table.
What happens if we use uniq before sort?
uniq would only remove duplicates that are adjacent in the original input order, missing duplicates separated by other lines. This is why sorting first is important.
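The effect of command order can be checked directly. The sketch below runs the same fruit list through three variants; the variable names are hypothetical and chosen only for this comparison.

```bash
#!/usr/bin/env bash
# Hypothetical variable names; same fruit list as the execution sample.
input=$(printf "apple\nbanana\napple\ncherry\nbanana\n")

# uniq alone: no two identical lines are adjacent in this input,
# so nothing is removed and all five lines come back unchanged.
uniq_only=$(printf '%s\n' "$input" | uniq)

# uniq before sort: the duplicates survive uniq, and sort then
# merely orders them, so both copies remain.
uniq_then_sort=$(printf '%s\n' "$input" | uniq | sort)

# sort before uniq: duplicates become adjacent, so uniq removes them.
sort_then_uniq=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq only:\n%s\n\n' "$uniq_only"
printf 'uniq | sort:\n%s\n\n' "$uniq_then_sort"
printf 'sort | uniq:\n%s\n' "$sort_then_uniq"
```

Only the last variant yields three lines; the first two still contain five.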
Why do we use a pipeline with | between commands?
The pipe | sends the output of one command as input to the next, allowing sort to receive the printed lines and uniq to receive the sorted lines, as shown in the concept flow.
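One way to see what the pipe does is to write the same data flow with intermediate files. This is a sketch for illustration only: the file names under the temporary directory are invented, and a real pipeline differs in that its commands run concurrently and no files are created.

```bash
#!/usr/bin/env bash
# Hypothetical file names inside a throwaway directory.
tmpdir=$(mktemp -d)

# Each command writes to a file that the next command reads.
printf "apple\nbanana\napple\ncherry\nbanana\n" > "$tmpdir/printed.txt"
sort < "$tmpdir/printed.txt" > "$tmpdir/sorted.txt"
uniq < "$tmpdir/sorted.txt" > "$tmpdir/unique.txt"
via_files=$(cat "$tmpdir/unique.txt")

# The pipeline connects the same commands directly, with no files.
via_pipe=$(printf "apple\nbanana\napple\ncherry\nbanana\n" | sort | uniq)

rm -rf "$tmpdir"

printf 'via files:\n%s\n\n' "$via_files"
printf 'via pipe:\n%s\n' "$via_pipe"
```

Both approaches produce the same three lines; the pipe simply removes the need to name and clean up the intermediate results.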
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table. What is the output after the sort command?
A. apple banana apple cherry banana
B. apple apple banana banana cherry
C. apple banana cherry
D. banana apple cherry
💡 Hint
Check the 'Output' column for step 2 in the execution table.
At which step does the pipeline remove duplicate lines?
A. Step 3 - uniq
B. Step 1 - printf
C. Step 2 - sort
D. Step 4 - end
💡 Hint
Look at the 'Explanation' column for step 3 in the execution table.
If we remove the sort command, how would the uniq output change?
A. It would still remove all duplicates correctly.
B. It would output the lines in sorted order.
C. It would remove only adjacent duplicates, missing others.
D. It would cause an error.
💡 Hint
Refer to the Key Moments section on uniq behavior without sorting.
Concept Snapshot
Use 'sort | uniq' in a pipeline to get unique sorted lines.
'sort' arranges lines alphabetically (by default, in the current locale's collation order).
'uniq' removes only adjacent duplicates.
Pipe output of sort into uniq for correct unique filtering.
Without sort, uniq misses non-adjacent duplicates.
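As a related aside that the walkthrough above does not cover, the POSIX `sort` command has a `-u` option that keeps one line per set of equal lines, so for whole-line comparison it gives the same result as `sort | uniq`. The sketch below compares the two; the variable names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical variable names; same fruit list as the execution sample.
input=$(printf "apple\nbanana\napple\ncherry\nbanana\n")

# Two-command form used throughout this walkthrough.
two_step=$(printf '%s\n' "$input" | sort | uniq)

# Single-command form: sort -u sorts and drops duplicate lines.
one_step=$(printf '%s\n' "$input" | sort -u)

if [ "$two_step" = "$one_step" ]; then
    echo "sort | uniq and sort -u agree"
fi
```

The two-command form remains useful when uniq's own options are needed, which `sort -u` does not provide.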
Full Transcript
This visual execution shows how the bash pipeline 'sort | uniq' works. First, input lines are printed with duplicates. Then 'sort' arranges these lines alphabetically but keeps duplicates. Next, 'uniq' removes duplicates only if they are next to each other. Sorting first groups duplicates together so uniq can remove them. The final output is unique sorted lines. Key points include that uniq alone only removes adjacent duplicates, so sorting first is essential. The pipeline uses the pipe symbol to send output from one command to the next. This step-by-step trace helps beginners see how each command changes the data and why the order matters.