
Bash Script to Remove Duplicate Characters from String

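Use printf '%s\n' "$string" | fold -w1 | awk '!seen[$0]++' | tr -d '\n' in Bash to remove duplicate characters from a string while preserving order. printf is used instead of echo so that strings such as -n or ones containing backslashes are passed through unchanged.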

📋

Examples

Input: hello

Output: helo

Input: banana

Output: ban

Input: aaaaa

Output: a

🧠

How to Think About It

To remove duplicate characters, check each character one by one and keep only its first occurrence. You can split the string into single characters, remember which ones you have already seen, and skip the repeats.

📐

Algorithm

Get the input string.

Split the string into individual characters.

Keep track of characters already seen.

For each character, if it has not been seen before, keep it; otherwise, skip it.

Join the kept characters back into a string.

Return the resulting string without duplicates.
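The steps above can be sketched directly in Bash. This is only one possible reading of the algorithm: the function name dedupe_chars is made up for illustration, and it tracks seen characters in a plain string rather than an array, so it also runs on older Bash versions. Because every kept character is also a seen character, the same variable serves as both the "seen" record and the result.

```bash
#!/bin/bash

# dedupe_chars is a hypothetical helper name, not part of the original script.
dedupe_chars() {
  local input=$1 seen="" c i
  for (( i = 0; i < ${#input}; i++ )); do
    c=${input:i:1}          # split: take one character at position i
    case $seen in
      *"$c"*) ;;            # already seen: skip it
      *) seen+=$c ;;        # first occurrence: remember it and keep it
    esac
  done
  printf '%s\n' "$seen"     # the kept characters, in original order
}

dedupe_chars "hello"   # prints: helo
```

Searching the seen string on every step costs more than an array lookup, so this sketch favors readability over speed.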

💻

Code

bash

#!/bin/bash

input="banana"

# Remove duplicate characters while preserving order
result=$(printf '%s\n' "$input" | fold -w1 | awk '!seen[$0]++' | tr -d '\n')

# Print the result
printf "%s\n" "$result"

Output

ban

🔍

Dry Run

Let's trace the input 'banana' through the code.

Split string into characters

b a n a n a

Filter unique characters with awk

b (not seen before, keep), a (not seen before, keep), n (not seen before, keep), a (seen before, skip), n (seen before, skip), a (seen before, skip)

Join characters back

ban

Character	Seen Before?	Action
b	No	Keep
a	No	Keep
n	No	Keep
a	Yes	Skip
n	Yes	Skip
a	Yes	Skip

💡

Why This Works

Step 1: Splitting the string

Using fold -w1 splits the string into one character per line so we can process each character separately.
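To see the split on its own, you can run just the first stage of the pipeline. The variable name chars is only for illustration. Note that some fold implementations count bytes rather than characters, so results for multibyte text (for example UTF-8 accents) may differ between systems.

```bash
# Run only the splitting stage and capture its output
chars=$(printf '%s\n' "banana" | fold -w1)

# Prints six lines, one character per line: b, a, n, a, n, a
printf '%s\n' "$chars"
```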

Step 2: Filtering duplicates

The awk '!seen[$0]++' command keeps only the first occurrence of each character by tracking seen characters in an array.
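The one-liner is compact, so it may help to see it written out in long form. The sketch below should behave the same way as '!seen[$0]++' for this input; the variable name unique is just for illustration.

```bash
# Long-form version of awk '!seen[$0]++'
unique=$(printf '%s\n' b a n a n a | awk '{
  if (seen[$0] == 0) {   # count is still 0: first time this line appears
    print                # so print it
  }
  seen[$0]++             # either way, bump the count for this line
}')

# Prints three lines: b, a, n
printf '%s\n' "$unique"
```

In the short form, seen[$0]++ returns the old count before incrementing it, and ! turns a count of 0 into "true", which makes awk print the line.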

Step 3: Rejoining characters

Finally, tr -d '\n' removes newlines to join the characters back into a single string without duplicates.
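Here is the joining stage in isolation, as a small sketch with a made-up variable name. Because tr -d '\n' deletes every newline, including the final one, the script adds a newline back when printing.

```bash
# Three separate lines go in, one joined string comes out
joined=$(printf '%s\n' b a n | tr -d '\n')

# The format string supplies the trailing newline that tr removed
printf '%s\n' "$joined"   # prints: ban
```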

🔄

Alternative Approaches

Using Bash associative array

bash

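#!/bin/bash
input="banana"
declare -A seen   # associative array: character -> 1 once seen (Bash 4+)
result=""
# Walk the string one character at a time
for (( i=0; i<${#input}; i++ )); do
  c=${input:i:1}
  # Keep the character only if it has no entry in seen yet
  if [[ -z ${seen[$c]} ]]; then
    result+=$c
    seen[$c]=1
  fi
done
printf "%s\n" "$result"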

This method uses pure Bash without external commands but requires Bash 4+ for associative arrays.
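Since the default /bin/bash on macOS is still version 3.2, a script that relies on declare -A may want to check the shell version first. The guard below is a minimal sketch; require_bash4 is a hypothetical helper name, while BASH_VERSINFO and BASH_VERSION are standard Bash variables.

```bash
#!/bin/bash

# require_bash4 is a hypothetical helper name, not part of the original script.
require_bash4() {
  # BASH_VERSINFO[0] holds the major version number of the running shell
  if (( BASH_VERSINFO[0] < 4 )); then
    printf 'Need Bash 4 or newer for declare -A, found %s\n' "$BASH_VERSION" >&2
    return 1
  fi
}

# Stop early instead of failing later with a confusing declare error
require_bash4 || exit 1
```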

Using grep and awk

bash

#!/bin/bash
input="banana"
result=$(printf '%s\n' "$input" | grep -o . | awk '!a[$0]++' | tr -d '\n')
printf "%s\n" "$result"

Similar to the main method but uses grep -o . to split characters instead of fold.

⚡

Complexity: O(n) time, O(n) space

Time Complexity

The script processes each character once, so time grows linearly with string length.

Space Complexity

It stores one entry per distinct character plus the result string, so memory use is bounded by the length of the input.

Which Approach is Fastest?

The Bash associative array method avoids the cost of starting external processes, so it tends to be faster for short strings, but it requires Bash 4+. For very long strings the character-by-character Bash loop is slow, and the awk pipeline usually wins. The pipeline is also simpler and relies only on standard tools.

Approach	Time	Space	Best For
awk pipeline	O(n)	O(n)	Simple scripts, portability, long strings
Bash associative array	O(n)	O(n)	Short strings, no external commands
grep and awk	O(n)	O(n)	Alternative splitting method
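Which approach is faster depends on the input length and on your machine, so it is worth measuring rather than guessing. The harness below is a rough sketch that assumes Bash 4+; the function names dedupe_pipeline and dedupe_loop and the test sizes are made up for illustration. It uses the Bash time keyword, which prints timings to the terminal, and no particular result is implied here.

```bash
#!/bin/bash

# Hypothetical wrappers around the two approaches shown above
dedupe_pipeline() {
  printf '%s\n' "$1" | fold -w1 | awk '!seen[$0]++' | tr -d '\n'
}

dedupe_loop() {
  local input=$1 result="" c i
  local -A seen=()                 # needs Bash 4+
  for (( i = 0; i < ${#input}; i++ )); do
    c=${input:i:1}
    if [[ -z ${seen[$c]:-} ]]; then
      result+=$c
      seen[$c]=1
    fi
  done
  printf '%s' "$result"
}

short="banana"
long=$(printf 'banana%.0s' {1..2000})   # "banana" repeated 2000 times

# Time both approaches on a short and a long input
for s in "$short" "$long"; do
  echo "length ${#s}:"
  time dedupe_pipeline "$s" >/dev/null
  time dedupe_loop "$s" >/dev/null
done
```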

💡

Use awk '!seen[$0]++' to easily filter unique lines or characters in Bash pipelines.
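The same filter works on whole lines, not just single characters. A small sketch with made-up sample data:

```bash
# Drop repeated lines while keeping the first occurrence of each
unique_lines=$(printf '%s\n' apple banana apple cherry banana | awk '!seen[$0]++')

# Prints three lines: apple, banana, cherry
printf '%s\n' "$unique_lines"
```

Unlike sort -u, this keeps the original order, and unlike uniq, it also catches duplicates that are not next to each other.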

⚠️

Forgetting to remove newlines after filtering duplicates causes output to be split across lines instead of a single string.
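You can see the difference by running the pipeline with and without the last stage. The variable names are only for illustration.

```bash
# Same pipeline, once without and once with the final tr
without_tr=$(printf '%s\n' "banana" | fold -w1 | awk '!seen[$0]++')
with_tr=$(printf '%s\n' "banana" | fold -w1 | awk '!seen[$0]++' | tr -d '\n')

printf '[%s]\n' "$without_tr"   # spans three lines: the newlines are still there
printf '[%s]\n' "$with_tr"      # prints: [ban]
```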