Bash Script to Find Frequency of Elements
Use sort | uniq -c in Bash to find the frequency of elements. For example, printf "%s\n" "${array[@]}" | sort | uniq -c counts each element's occurrences.
Examples
Input: apple apple banana apple orange banana
Output:
3 apple
2 banana
1 orange
Input: dog cat dog dog bird cat
Output:
3 dog
2 cat
1 bird
Input
Output
How to Think About It
To find how often each element appears, first list all elements line by line, then sort them so identical items group together. Finally, count how many times each unique item appears using uniq -c.
Algorithm
1. Get the list of elements as input.
2. Print each element on a new line.
3. Sort the list to group identical elements together.
4. Use a command to count consecutive duplicate lines.
5. Display the count alongside each unique element.
Code
bash
elements=(apple apple banana apple orange banana)
# One element per line, sorted so duplicates are adjacent, then counted
printf "%s\n" "${elements[@]}" | sort | uniq -c
Output
3 apple
2 banana
1 orange
Dry Run
Let's trace the example array [apple, apple, banana, apple, orange, banana] through the code:
1. Print elements line by line: apple, apple, banana, apple, orange, banana
2. Sort elements: apple, apple, apple, banana, banana, orange
3. Count unique elements: 3 apple, 2 banana, 1 orange
| Element |
|---|
| apple |
| apple |
| banana |
| apple |
| orange |
| banana |
Why This Works
Step 1: Print elements line by line
Using printf "%s\n" prints each element on its own line, preparing for sorting.
Step 2: Sort elements
Sorting groups identical elements together so counting duplicates is easy.
Step 3: Count unique elements
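uniq -c collapses each run of adjacent identical lines into a single line prefixed with the length of that run. Because the input is sorted, each run contains every occurrence of that element.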
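The three steps can be wrapped in a small function. This is only a sketch: count_freq is a hypothetical helper name, not something from the original script. The guard is there because printf "%s\n" with no arguments still prints one empty line, which uniq -c would otherwise report as a single empty element.

```bash
#!/usr/bin/env bash
# count_freq is a hypothetical helper name chosen for this sketch.
count_freq() {
  # With no arguments, printf '%s\n' would still emit one empty line,
  # so return early to keep the output empty for an empty list.
  (( $# == 0 )) && return 0
  printf '%s\n' "$@" | sort | uniq -c
}

elements=(apple apple banana apple orange banana)
count_freq "${elements[@]}"
```

Note that uniq -c pads each count with leading spaces, so the raw output is right-aligned rather than starting in the first column.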
Alternative Approaches
Using associative arrays in Bash
bash
declare -A freq   # associative array mapping element -> count (Bash 4+)
for e in "${elements[@]}"; do ((freq[$e]++)); done
for k in "${!freq[@]}"; do echo "$k: ${freq[$k]}"; done
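This method counts with a Bash associative array in a single O(n) pass and starts no external processes, but it requires Bash 4+ and prints the keys in an unspecified order. The loop itself is interpreted by Bash, so for very large inputs it is worth measuring against sort | uniq -c rather than assuming it is faster.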
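A slightly more defensive variant is sketched below, under two assumptions that go beyond the original snippet: the script may run under set -e, and a stable output order is wanted. ((freq[$e]++)) returns a non-zero status the first time a key is incremented (the expression evaluates to 0), which would abort a set -e script, so the sketch uses a plain assignment instead. The variable name report is made up for illustration.

```bash
#!/usr/bin/env bash
# Requires Bash 4+ for associative arrays.
elements=(apple apple banana apple orange banana)

declare -A freq=()
for e in "${elements[@]}"; do
  # ${freq[$e]:-0} treats a missing key as 0 before adding 1.
  freq[$e]=$(( ${freq[$e]:-0} + 1 ))
done

# Key order from "${!freq[@]}" is unspecified, so sort the report:
# by count (numeric, descending), then by element name.
report=$(
  for k in "${!freq[@]}"; do
    printf '%d %s\n' "${freq[$k]}" "$k"
  done | sort -k1,1nr -k2
)
printf '%s\n' "$report"
# Prints:
# 3 apple
# 2 banana
# 1 orange
```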
Using awk
bash
printf "%s\n" "${elements[@]}" | awk '{count[$0]++} END {for (word in count) print count[word], word}'
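Awk counts frequencies in a single pass and can be more flexible for complex processing. The for (word in count) loop visits keys in an unspecified order, so the output lines are not sorted.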
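As one illustration of that flexibility, the sketch below reports elements in the order they first appear in the input, which sort | uniq -c cannot do. The array names order and count and the variable first_seen are choices made for this example only.

```bash
#!/usr/bin/env bash
elements=(apple apple banana apple orange banana)

first_seen=$(
  printf '%s\n' "${elements[@]}" |
    awk '
      !($0 in count) { order[++n] = $0 }  # record each element the first time it appears
      { count[$0]++ }                     # count every occurrence
      END { for (i = 1; i <= n; i++) print count[order[i]], order[i] }
    '
)
printf '%s\n' "$first_seen"
# Prints:
# 3 apple
# 2 banana
# 1 orange
```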
Complexity: O(n log n) time, O(n) space
Time Complexity
Sorting the list takes O(n log n) time, which dominates the counting step that is O(n).
Space Complexity
Extra space is needed to store the sorted list and counts, so O(n) space is used.
Which Approach is Fastest?
Using associative arrays in Bash or awk can be faster for large inputs since they avoid sorting, but sort | uniq -c is simpler and widely supported.
| Approach | Time | Space | Best For |
|---|---|---|---|
| sort + uniq -c | O(n log n) | O(n) | Simple scripts, small to medium data |
| Bash associative arrays | O(n) | O(n) | Large data, Bash 4+ environments |
| awk counting | O(n) | O(n) | Flexible processing, large data |
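The table gives asymptotic costs only, and real timings depend on the machine and the data. A rough way to compare the three approaches is sketched below. The list size, the item names, and the function names via_sort, via_awk and via_bash are arbitrary assumptions for this sketch, and each function normalizes its output to "count element" lines so the results are comparable.

```bash
#!/usr/bin/env bash
# Build a synthetic list; size and value range are arbitrary assumptions.
n=20000
data=()
for ((i = 0; i < n; i++)); do
  data+=("item$((RANDOM % 100))")
done

via_sort() {
  # awk strips the padding that uniq -c puts in front of each count
  printf '%s\n' "${data[@]}" | sort | uniq -c | awk '{print $1, $2}' | sort
}

via_awk() {
  printf '%s\n' "${data[@]}" |
    awk '{c[$0]++} END {for (w in c) print c[w], w}' | sort
}

via_bash() {
  local -A freq=()   # Bash 4+
  local e k
  for e in "${data[@]}"; do freq[$e]=$(( ${freq[$e]:-0} + 1 )); done
  for k in "${!freq[@]}"; do printf '%d %s\n' "${freq[$k]}" "$k"; done | sort
}

# time is a Bash keyword; its report goes to stderr.
for f in via_sort via_awk via_bash; do
  echo "== $f"
  time "$f" > /dev/null
done
```

Running it a few times with different values of n gives a better picture than a single measurement.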
Always sort your list before using uniq -c to get correct frequency counts.
Forgetting to sort the input before uniq -c causes incorrect frequency results, because uniq only merges duplicate lines that are adjacent.
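The effect is easy to see on the example array. In this sketch the variable names unsorted and sorted are made up for illustration, and the awk step only removes the leading padding that uniq -c adds.

```bash
#!/usr/bin/env bash
elements=(apple apple banana apple orange banana)

# Without sort: uniq -c only merges adjacent lines, so apple and banana are split.
unsorted=$(printf '%s\n' "${elements[@]}" | uniq -c | awk '{print $1, $2}')

# With sort: every occurrence of an element is adjacent, so the counts are complete.
sorted=$(printf '%s\n' "${elements[@]}" | sort | uniq -c | awk '{print $1, $2}')

echo "Unsorted input:"
printf '%s\n' "$unsorted"
# 2 apple
# 1 banana
# 1 apple
# 1 orange
# 1 banana

echo "Sorted input:"
printf '%s\n' "$sorted"
# 3 apple
# 2 banana
# 1 orange
```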