Bash scripting · How-To · Beginner · 2 min read

Bash Script to Find Frequency of Elements

Use sort | uniq -c in Bash to find the frequency of elements, for example: printf "%s\n" "${array[@]}" | sort | uniq -c counts each element's occurrences.
📋

Examples

Input: apple apple banana apple orange banana
Output: 3 apple 2 banana 1 orange
Input: dog cat dog dog bird cat
Output: 3 dog 2 cat 1 bird
🧠

How to Think About It

To find how often each element appears, first list all elements line by line, then sort them so identical items group together. Finally, count how many times each unique item appears using uniq -c.
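The same idea extends beyond arrays to any text stream. As a quick sketch (the sample string here is made up for illustration), you can break a line of whitespace-separated words into one word per line with tr, then apply the same sort-and-count pipeline:

```bash
# Hypothetical sample text; tr converts spaces to newlines
text="red blue red green blue red"
echo "$text" | tr ' ' '\n' | sort | uniq -c | sort -rn
# Most frequent word first: red (3), then blue (2), then green (1)
```

Adding sort -rn at the end orders the results numerically by count, highest first.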
📐

Algorithm

1
Get the list of elements as input.
2
Print each element on a new line.
3
Sort the list to group identical elements together.
4
Use a command to count consecutive duplicate lines.
5
Display the count alongside each unique element.
💻

Code

bash
elements=(apple apple banana apple orange banana)
printf "%s\n" "${elements[@]}" | sort | uniq -c
Output
      3 apple
      2 banana
      1 orange
🔍

Dry Run

Let's trace the example array [apple, apple, banana, apple, orange, banana] through the code:

1

Print elements line by line

apple
apple
banana
apple
orange
banana

2

Sort elements

apple
apple
apple
banana
banana
orange

3

Count unique elements

3 apple
2 banana
1 orange

💡

Why This Works

Step 1: Print elements line by line

Using printf "%s\n" prints each element on its own line, preparing for sorting.

Step 2: Sort elements

Sorting groups identical elements together so counting duplicates is easy.

Step 3: Count unique elements

uniq -c counts how many times each unique line appears consecutively.
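To see why the sort step is essential, here is what happens if it is skipped. Because uniq -c only merges consecutive duplicate lines, non-adjacent occurrences are counted separately:

```bash
elements=(apple apple banana apple orange banana)
# Without sort, uniq -c only collapses *adjacent* duplicates:
printf "%s\n" "${elements[@]}" | uniq -c
# apple is reported twice (counts 2 and 1) because its
# occurrences are not next to each other in the input
```

Sorting first guarantees that all copies of each element sit on consecutive lines, so every element appears exactly once in the output with its full count.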

🔄

Alternative Approaches

Using associative arrays in Bash
bash
declare -A freq                                          # associative arrays require Bash 4+
for e in "${elements[@]}"; do ((freq[$e]++)); done       # increment the count for each element
for k in "${!freq[@]}"; do echo "$k: ${freq[$k]}"; done  # keys come back in arbitrary order
This method uses Bash's associative arrays for counting, which is faster for large data but requires Bash 4+.
Using awk
bash
printf "%s\n" "${elements[@]}" | awk '{count[$0]++} END {for (word in count) print count[word], word}'
Awk counts frequencies in one step and can be more flexible for complex processing.
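Whichever approach you pick, the raw output is not ordered by frequency. A small sketch of the original pipeline extended with sort -rn, which sorts the counted lines numerically in descending order so the most common element comes first:

```bash
elements=(apple apple banana apple orange banana)
# sort groups duplicates, uniq -c counts them, sort -rn ranks by count
printf "%s\n" "${elements[@]}" | sort | uniq -c | sort -rn
```

This is handy when you only care about the top results, e.g. piping through head -n 1 to get the most frequent element.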

Complexity: O(n log n) time, O(n) space

Time Complexity

Sorting the list takes O(n log n) time, which dominates the counting step that is O(n).

Space Complexity

Extra space is needed to store the sorted list and counts, so O(n) space is used.

Which Approach is Fastest?

Using associative arrays in Bash or awk can be faster for large inputs since they avoid sorting, but sort | uniq -c is simpler and widely supported.

| Approach | Time | Space | Best For |
|---|---|---|---|
| sort + uniq -c | O(n log n) | O(n) | Simple scripts, small to medium data |
| Bash associative arrays | O(n) | O(n) | Large data, Bash 4+ environments |
| awk counting | O(n) | O(n) | Flexible processing, large data |
💡
Always sort your list before using uniq -c to get correct frequency counts.
⚠️
Forgetting to sort the input before uniq -c produces incorrect results: uniq -c only merges consecutive duplicate lines, so non-adjacent occurrences of the same element are counted separately.