0
0
DSA Cprogramming

Edit Distance Problem Levenshtein in DSA C

Choose your learning style9 modes available
Mental Model
Find the smallest number of changes needed to turn one word into another by adding, removing, or changing letters.
Analogy: Imagine correcting a typo in a word by either erasing a letter, writing a new letter, or changing a wrong letter to the right one, counting each change as one step.
word1: c a t
word2: c a r

We compare letters one by one and decide if we need to change, add, or remove letters to match.
Dry Run Walkthrough
Input: word1: "cat", word2: "car"
Goal: Find the minimum steps to change "cat" into "car"
Step 1: Compare first letters 'c' and 'c', they match, no change needed
cost matrix:
  0 1 2 3
0 0 1 2 3
1 1 0 1 2
2 2
3 3
Why: Matching letters cost 0, so we keep the cost from previous step
Step 2: Compare second letters 'a' and 'a', they match, no change needed
cost matrix:
  0 1 2 3
0 0 1 2 3
1 1 0 1 2
2 2 1 2
3 3
Why: Matching letters cost 0, so cost stays same as previous
Step 3: Compare third letters 't' and 'r', they differ, consider replace, insert, delete
cost matrix:
  0 1 2 3
0 0 1 2 3
1 1 0 1 2
2 2 1 2 2
3 3 2 2 1
Why: Replace 't' with 'r' costs 1, which is minimum among insert/delete/replace
Result:
Minimum edit distance = 1
Final cost matrix:
  0 1 2 3
0 0 1 2 3
1 1 0 1 2
2 2 1 2 2
3 3 2 2 1
Annotated Code
DSA C
#include <stdio.h>
#include <string.h>

int min(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

int editDistance(char *word1, char *word2) {
    int len1 = strlen(word1);
    int len2 = strlen(word2);
    int dp[len1 + 1][len2 + 1];

    for (int i = 0; i <= len1; i++) {
        dp[i][0] = i; // cost of deleting all letters from word1
    }
    for (int j = 0; j <= len2; j++) {
        dp[0][j] = j; // cost of inserting all letters to word1
    }

    for (int i = 1; i <= len1; i++) {
        for (int j = 1; j <= len2; j++) {
            if (word1[i - 1] == word2[j - 1]) {
                dp[i][j] = dp[i - 1][j - 1]; // letters match, no cost
            } else {
                dp[i][j] = 1 + min(
                    dp[i - 1][j],    // delete
                    dp[i][j - 1],    // insert
                    dp[i - 1][j - 1] // replace
                );
            }
        }
    }
    return dp[len1][len2];
}

int main() {
    char word1[] = "cat";
    char word2[] = "car";
    int dist = editDistance(word1, word2);
    printf("Minimum edit distance = %d\n", dist);
    return 0;
}
dp[i][0] = i; // cost of deleting all letters from word1
initialize first column: cost to delete letters to match empty string
dp[0][j] = j; // cost of inserting all letters to word1
initialize first row: cost to insert letters to match word2
if (word1[i - 1] == word2[j - 1]) { dp[i][j] = dp[i - 1][j - 1]; // letters match, no cost } else { dp[i][j] = 1 + min( dp[i - 1][j], // delete dp[i][j - 1], // insert dp[i - 1][j - 1] // replace ); }
if letters match, carry cost; else choose min cost among delete, insert, replace plus one
OutputSuccess
Minimum edit distance = 1
Complexity Analysis
Time: O(m*n) because we fill a table of size m by n where m and n are the lengths of the two words
Space: O(m*n) because we store the edit distances for all substring pairs in a 2D table
vs Alternative: Naive recursive approach has exponential time due to repeated subproblems; dynamic programming reduces it to polynomial time by storing results
Edge Cases
One or both words are empty strings
The edit distance equals the length of the non-empty word (all insertions or deletions)
DSA C
dp[i][0] = i; // cost of deleting all letters from word1
Both words are identical
Edit distance is zero since no changes are needed
DSA C
if (word1[i - 1] == word2[j - 1]) {
    dp[i][j] = dp[i - 1][j - 1];
Words differ completely
Algorithm counts minimum edits by considering all operations at each step
DSA C
dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]);
When to Use This Pattern
When you need to find how similar two strings are by counting changes, use the Edit Distance pattern because it systematically counts minimal edits.
Common Mistakes
Mistake: Not initializing the first row and column of the dp table correctly
Fix: Initialize dp[i][0] = i and dp[0][j] = j to represent base cases of empty strings
Mistake: Confusing indices when accessing characters of the strings
Fix: Use i-1 and j-1 to access characters because dp indices start from 1 for convenience
Mistake: Forgetting to add 1 when choosing among insert, delete, replace operations
Fix: Add 1 to the minimum of the three operations to count the current edit
Summary
Calculates the minimum number of insertions, deletions, or replacements to convert one word into another.
Use when you want to measure how different two strings are or find the minimal edits to transform one string into another.
The key insight is to build a table of solutions for smaller parts and use them to solve the bigger problem efficiently.