CProgramBeginner · 2 min read

C Program to Remove Duplicates from String

Use a loop to check each character in the string and copy it to a new string only if it has not appeared before, like in

for (int i = 0; str[i] != '\0'; i++) { if (!seen[(unsigned char)str[i]]) { result[j++] = str[i]; seen[(unsigned char)str[i]] = 1; }}

📋

Examples

Inputhello

Outputhelo

Inputprogramming

Outputprogamin

Inputaaaaa

Outputa

🧠

How to Think About It

To remove duplicates, think of checking each character one by one and remembering if you have seen it before. If you have not seen it, keep it; if you have, skip it. This way, you build a new string with only unique characters.

📐

Algorithm

Get the input string.

Create an array to keep track of seen characters.

Loop through each character in the input string.

If the character is not marked as seen, add it to the result and mark it seen.

End the result string with a null character.

Print the result string.

💻

Code

#include <stdio.h>
#include <string.h>

int main() {
    char str[100], result[100];
    int seen[256] = {0};
    int j = 0;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    for (int i = 0; str[i] != '\0'; i++) {
        if (!seen[(unsigned char)str[i]]) {
            result[j++] = str[i];
            seen[(unsigned char)str[i]] = 1;
        }
    }
    result[j] = '\0';

    printf("String after removing duplicates: %s\n", result);
    return 0;
}

🔍

Dry Run

Let's trace the input "programming" through the code

Initialize

seen array all zeros, result empty, j=0

Check 'p'

'p' not seen, add to result: result='p', mark seen['p']=1, j=1

Check 'r'

'r' not seen, add to result: result='pr', mark seen['r']=1, j=2

Check 'o'

'o' not seen, add to result: result='pro', mark seen['o']=1, j=3

Check 'g'

'g' not seen, add to result: result='prog', mark seen['g']=1, j=4

Check 'r'

'r' already seen, skip

Check 'a'

'a' not seen, add to result: result='proga', mark seen['a']=1, j=5

Check 'm'

'm' not seen, add to result: result='progam', mark seen['m']=1, j=6

Check 'm'

'm' already seen, skip

Check 'i'

'i' not seen, add to result: result='progami', mark seen['i']=1, j=7

Check 'n'

'n' not seen, add to result: result='progamin', mark seen['n']=1, j=8

Check 'g'

'g' already seen, skip

End

Add '\0' to result: result='progamin'

Character	Seen Before?	Action	Result String
p	No	Add	p
r	No	Add	pr
o	No	Add	pro
g	No	Add	prog
r	Yes	Skip	prog
a	No	Add	proga
m	No	Add	progam
m	Yes	Skip	progam
i	No	Add	progami
n	No	Add	progamin
g	Yes	Skip	progamin

💡

Why This Works

Step 1: Track seen characters

We use an array seen to remember which characters have appeared before.

Step 2: Build new string

We add characters to the result only if they are not marked as seen, ensuring no duplicates.

Step 3: End string properly

We add a null character '\0' at the end to mark the string's end.

🔄

Alternative Approaches

In-place removal

#include <stdio.h>
#include <string.h>

int main() {
    char str[100];
    int len, i, j, k;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    len = strlen(str);
    for (i = 0; i < len; i++) {
        for (j = i + 1; j < len;) {
            if (str[i] == str[j]) {
                for (k = j; k < len; k++) {
                    str[k] = str[k + 1];
                }
                len--;
            } else {
                j++;
            }
        }
    }

    printf("String after removing duplicates: %s\n", str);
    return 0;
}

This method removes duplicates directly in the original string but is less efficient due to nested loops.

Using sorting

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int cmp(const void *a, const void *b) {
    return (*(char *)a - *(char *)b);
}

int main() {
    char str[100];
    int len, i, j = 0;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    len = strlen(str);
    qsort(str, len, sizeof(char), cmp);

    for (i = 0; i < len; i++) {
        if (i == 0 || str[i] != str[i - 1]) {
            str[j++] = str[i];
        }
    }
    str[j] = '\0';

    printf("String after removing duplicates: %s\n", str);
    return 0;
}

Sorting groups duplicates together, making it easy to remove them, but changes the original order.

⚡

Complexity: O(n) time, O(1) space

Time Complexity

The program loops through the string once, checking each character, so it runs in linear time O(n) where n is the string length.

Space Complexity

It uses a fixed-size array of 256 integers to track characters, so space is O(1) constant.

Which Approach is Fastest?

Using a seen array is faster than nested loops or sorting because it only requires one pass and constant extra space.

Approach	Time	Space	Best For
Seen array	O(n)	O(1)	Fast and preserves order
In-place nested loops	O(n^2)	O(1)	No extra space but slower
Sorting then remove	O(n log n)	O(1)	Simple but changes order

💡

Use an array to track seen characters for fast duplicate detection.

⚠️

Forgetting to add the null character at the end of the new string causes garbage output.