
Bash Script to Remove Duplicates from Array

Use an associative array to record the elements already seen, and print each element only the first time it appears: declare -A seen; for i in "${arr[@]}"; do [[ -z ${seen[$i]} ]] && echo "$i" && seen[$i]=1; done. Associative arrays require Bash 4 or later.
📋

Examples

Input: arr=(apple banana apple orange banana)
Output: apple banana orange

Input: arr=(1 2 3 2 1 4 5)
Output: 1 2 3 4 5

Input: arr=()
Output: (empty)
🧠

How to Think About It

To remove duplicates from an array in Bash, think about remembering which items you have already seen as you check each element. You can use a special kind of array called an associative array to keep track of these items. Then, only print or keep the elements that you find for the first time.
📐

Algorithm

1. Create an empty associative array to store seen elements.
2. Loop through each element in the original array.
3. Check if the element is already in the associative array.
4. If not, print or save the element and mark it as seen.
5. Continue until all elements are processed.
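
The steps above mention printing or saving each new element. As a minimal sketch of the "save" variant, the loop below collects the unique elements into a second array instead of printing them as it goes. The name unique is an assumption chosen for illustration, and the script assumes Bash 4 or later.

```bash
#!/usr/bin/env bash
arr=(apple banana apple orange banana)

declare -A seen=()   # step 1: empty map of elements already encountered
unique=()            # hypothetical result array holding first occurrences

for item in "${arr[@]}"; do        # step 2: visit every element in order
  if [[ -z ${seen[$item]} ]]; then # step 3: empty value means "not seen yet"
    unique+=("$item")              # step 4: save the element...
    seen[$item]=1                  # ...and mark it as seen
  fi
done                               # step 5: stop when the array is exhausted

printf '%s\n' "${unique[@]}"
```

Keeping the result in an array makes it easy to pass the deduplicated list to later commands, for example with "${unique[@]}".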
💻

Code

bash
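#!/bin/bash
arr=(apple banana apple orange banana)
declare -A seen   # associative array (Bash 4+): keys are elements already printed
for item in "${arr[@]}"; do
  # An unset key expands to an empty string, so -z is true on first sight
  if [[ -z ${seen[$item]} ]]; then
    echo "$item"
    seen[$item]=1   # mark the element as seen
  fi
done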
Output
apple
banana
orange
🔍

Dry Run

Let's trace the array (apple banana apple orange banana) through the code

1. Initialize the associative array: seen is empty.
2. Process the first element 'apple': seen[apple] is empty, so print 'apple' and set seen[apple]=1.
3. Process the second element 'banana': seen[banana] is empty, so print 'banana' and set seen[banana]=1.
4. Process the third element 'apple': seen[apple] is 1, so skip printing.
5. Process the fourth element 'orange': seen[orange] is empty, so print 'orange' and set seen[orange]=1.
6. Process the fifth element 'banana': seen[banana] is 1, so skip printing.

| Element | Seen Before? | Action |
| --- | --- | --- |
| apple | No | Print apple, mark seen |
| banana | No | Print banana, mark seen |
| apple | Yes | Skip |
| orange | No | Print orange, mark seen |
| banana | Yes | Skip |
💡

Why This Works

Step 1: Use associative array for tracking

declare -A seen creates an associative array (a map) whose keys are the array elements; a stored value of 1 marks an element as seen.

Step 2: Check each element once

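For each element, the script checks whether it is already a key in seen. An unset key expands to an empty string, so [[ -z ${seen[$item]} ]] is true only the first time an element appears.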

Step 3: Print only unique elements

Only elements not marked as seen are printed and then marked to prevent future duplicates.
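
To make the lookup behavior concrete, here is a small sketch that marks two keys by hand and then queries three. The keys are made up for illustration; the point is that reading a key that was never set yields an empty string and does not add it to the map.

```bash
#!/usr/bin/env bash
declare -A seen=()

seen[apple]=1          # mark two elements as seen
seen["red apple"]=1    # quoted keys may contain spaces

for key in apple "red apple" orange; do
  if [[ -n ${seen[$key]} ]]; then
    echo "$key: seen before"
  else
    echo "$key: first time"
  fi
done

# Looking up 'orange' above did not create a key for it
echo "distinct keys stored: ${#seen[@]}"
```

The loop reports apple and red apple as seen before and orange as first time, and the final line reports 2 stored keys.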

🔄

Alternative Approaches

Using sort and uniq
bash
arr=(apple banana apple orange banana)
printf "%s\n" "${arr[@]}" | sort | uniq
This sorts the elements and drops adjacent duplicates, so the result comes out in sorted order rather than the original order. sort -u does the same in a single command.
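
If the sorted result is needed as an array rather than printed text, one option is to read it back with mapfile (a Bash 4+ builtin). This is a sketch under the assumption that no element contains a newline; sorted_unique is a hypothetical name.

```bash
#!/usr/bin/env bash
arr=(pear apple pear banana apple)

# sort -u sorts and removes duplicates; mapfile -t stores one line per element
mapfile -t sorted_unique < <(printf '%s\n' "${arr[@]}" | sort -u)

printf '%s\n' "${sorted_unique[@]}"
```

Here pear moves from first place to last, which illustrates the reordering. Note that with an empty input array printf still emits one blank line, so the result would contain a single empty element.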
Using a temporary file and grep
bash
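arr=(apple banana apple orange banana)
tmp=$(mktemp)   # fresh temporary file, so leftover data cannot hide elements
for i in "${arr[@]}"; do
  # -F: match literally, -x: whole line only, --: element may start with a dash
  if ! grep -qxF -- "$i" "$tmp"; then
    echo "$i"
    echo "$i" >> "$tmp"
  fi
done
rm -f "$tmp"
This uses a file to track seen elements. It is slower because every element starts a grep process that rescans the file, and it needs a temporary file that must be cleaned up.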

Complexity: O(n) time, O(n) space

Time Complexity

The script loops once through all n elements, and each lookup or insert in the associative array takes constant time on average, so it runs in O(n).

Space Complexity

It uses extra space proportional to the number of unique elements to store them in the associative array, so O(n).

Which Approach is Fastest?

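The associative array approach does the least work asymptotically and preserves the original order. Sort and uniq costs O(n log n) and reorders the elements, while the file-and-grep approach rescans the file for every element.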

| Approach | Time | Space | Best For |
| --- | --- | --- | --- |
| Associative array | O(n) | O(n) | Preserving order, fast lookup |
| Sort and uniq | O(n log n) | O(n) | Simple scripts, order not important |
| File and grep | O(n^2) | O(n) | Very basic Bash without associative arrays |
💡
Use an associative array in Bash 4+ to efficiently track and remove duplicates while preserving order.
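
Because declare -A is unavailable before Bash 4, a script can check the shell version first. The following is a minimal sketch; has_assoc_arrays is a hypothetical helper name, while BASH_VERSINFO is the standard Bash array holding the version components.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only when the running shell is Bash 4 or newer.
has_assoc_arrays() {
  [[ -n ${BASH_VERSINFO:-} ]] && (( BASH_VERSINFO[0] >= 4 ))
}

if has_assoc_arrays; then
  declare -A seen=()
  echo "associative arrays available (Bash ${BASH_VERSION})"
else
  echo "Bash 4+ required for declare -A; consider sort | uniq instead" >&2
fi
```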
⚠️
Looping over a plain array without tracking seen elements prints every duplicate again; the lookup in seen is what filters out the repeats.