0
0
CProgramBeginner · 2 min read

C Program to Remove Duplicates from String

Use a loop to check each character in the string and copy it to a new string only if it has not appeared before, like in for (int i = 0; str[i] != '\0'; i++) { if (!seen[(unsigned char)str[i]]) { result[j++] = str[i]; seen[(unsigned char)str[i]] = 1; }}.
📋

Examples

Inputhello
Outputhelo
Inputprogramming
Outputprogamin
Inputaaaaa
Outputa
🧠

How to Think About It

To remove duplicates, think of checking each character one by one and remembering if you have seen it before. If you have not seen it, keep it; if you have, skip it. This way, you build a new string with only unique characters.
📐

Algorithm

1
Get the input string.
2
Create an array to keep track of seen characters.
3
Loop through each character in the input string.
4
If the character is not marked as seen, add it to the result and mark it seen.
5
End the result string with a null character.
6
Print the result string.
💻

Code

c
#include <stdio.h>
#include <string.h>

int main() {
    char str[100], result[100];
    int seen[256] = {0};
    int j = 0;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    for (int i = 0; str[i] != '\0'; i++) {
        if (!seen[(unsigned char)str[i]]) {
            result[j++] = str[i];
            seen[(unsigned char)str[i]] = 1;
        }
    }
    result[j] = '\0';

    printf("String after removing duplicates: %s\n", result);
    return 0;
}
🔍

Dry Run

Let's trace the input "programming" through the code

1

Initialize

seen array all zeros, result empty, j=0

2

Check 'p'

'p' not seen, add to result: result='p', mark seen['p']=1, j=1

3

Check 'r'

'r' not seen, add to result: result='pr', mark seen['r']=1, j=2

4

Check 'o'

'o' not seen, add to result: result='pro', mark seen['o']=1, j=3

5

Check 'g'

'g' not seen, add to result: result='prog', mark seen['g']=1, j=4

6

Check 'r'

'r' already seen, skip

7

Check 'a'

'a' not seen, add to result: result='proga', mark seen['a']=1, j=5

8

Check 'm'

'm' not seen, add to result: result='progam', mark seen['m']=1, j=6

9

Check 'm'

'm' already seen, skip

10

Check 'i'

'i' not seen, add to result: result='progami', mark seen['i']=1, j=7

11

Check 'n'

'n' not seen, add to result: result='progamin', mark seen['n']=1, j=8

12

Check 'g'

'g' already seen, skip

13

End

Add '\0' to result: result='progamin'

CharacterSeen Before?ActionResult String
pNoAddp
rNoAddpr
oNoAddpro
gNoAddprog
rYesSkipprog
aNoAddproga
mNoAddprogam
mYesSkipprogam
iNoAddprogami
nNoAddprogamin
gYesSkipprogamin
💡

Why This Works

Step 1: Track seen characters

We use an array seen to remember which characters have appeared before.

Step 2: Build new string

We add characters to the result only if they are not marked as seen, ensuring no duplicates.

Step 3: End string properly

We add a null character '\0' at the end to mark the string's end.

🔄

Alternative Approaches

In-place removal
c
#include <stdio.h>
#include <string.h>

int main() {
    char str[100];
    int len, i, j, k;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    len = strlen(str);
    for (i = 0; i < len; i++) {
        for (j = i + 1; j < len;) {
            if (str[i] == str[j]) {
                for (k = j; k < len; k++) {
                    str[k] = str[k + 1];
                }
                len--;
            } else {
                j++;
            }
        }
    }

    printf("String after removing duplicates: %s\n", str);
    return 0;
}
This method removes duplicates directly in the original string but is less efficient due to nested loops.
Using sorting
c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int cmp(const void *a, const void *b) {
    return (*(char *)a - *(char *)b);
}

int main() {
    char str[100];
    int len, i, j = 0;

    printf("Enter a string: ");
    fgets(str, sizeof(str), stdin);
    str[strcspn(str, "\n")] = '\0';

    len = strlen(str);
    qsort(str, len, sizeof(char), cmp);

    for (i = 0; i < len; i++) {
        if (i == 0 || str[i] != str[i - 1]) {
            str[j++] = str[i];
        }
    }
    str[j] = '\0';

    printf("String after removing duplicates: %s\n", str);
    return 0;
}
Sorting groups duplicates together, making it easy to remove them, but changes the original order.

Complexity: O(n) time, O(1) space

Time Complexity

The program loops through the string once, checking each character, so it runs in linear time O(n) where n is the string length.

Space Complexity

It uses a fixed-size array of 256 integers to track characters, so space is O(1) constant.

Which Approach is Fastest?

Using a seen array is faster than nested loops or sorting because it only requires one pass and constant extra space.

ApproachTimeSpaceBest For
Seen arrayO(n)O(1)Fast and preserves order
In-place nested loopsO(n^2)O(1)No extra space but slower
Sorting then removeO(n log n)O(1)Simple but changes order
💡
Use an array to track seen characters for fast duplicate detection.
⚠️
Forgetting to add the null character at the end of the new string causes garbage output.