0
0
DSA Cprogramming

Longest Common Subsequence in DSA C

Choose your learning style9 modes available
Mental Model
Find the longest sequence of characters that appear in the same order in both strings, but not necessarily consecutively.
Analogy: Imagine two friends telling stories with some common events. The longest common subsequence is like the longest list of shared events they both mention in the same order, even if they skip some details.
String1: A B C D G H
String2: A E D F H R

LCS matrix (partial):
    Ø A E D F H R
Ø [0 0 0 0 0 0 0]
A [0 1 1 1 1 1 1]
B [0 1 1 1 1 1 1]
C [0 1 1 1 1 1 1]
D [0 1 1 2 2 2 2]
G [0 1 1 2 2 2 2]
H [0 1 1 2 2 3 3]
Dry Run Walkthrough
Input: String1: "ABCDGH", String2: "AEDFHR"
Goal: Find the longest common subsequence between the two strings
Step 1: Initialize a 2D array dp with zeros for lengths of substrings
dp matrix filled with zeros for dimensions (7 x 7) including empty prefix rows and columns
Why: We need a base to build subsequence lengths from empty prefixes
Step 2: Compare characters 'A' (String1[0]) and 'A' (String2[0]), they match, set dp[1][1] = dp[0][0] + 1 = 1
dp[1][1] = 1, rest zeros in first row and column
Why: Matching characters increase subsequence length by 1
Step 3: Compare 'B' (String1[1]) and 'E' (String2[1]), no match, set dp[2][2] = max(dp[1][2], dp[2][1]) = 1
dp[2][2] = 1, subsequence length remains 1
Why: No match, so take the best subsequence length from previous prefixes
Step 4: Compare 'D' (String1[3]) and 'D' (String2[2]), match, set dp[4][3] = dp[3][2] + 1 = 2
dp[4][3] = 2, subsequence length increases
Why: Matching characters extend the subsequence length
Step 5: Compare 'H' (String1[5]) and 'H' (String2[4]), match, set dp[6][5] = dp[5][4] + 1 = 3
dp[6][5] = 3, longest subsequence length found
Why: Last matching characters increase subsequence length to final value
Step 6: Trace back from dp[6][5] to find subsequence characters 'A', 'D', 'H'
LCS is "ADH"
Why: Backtracking recovers the actual subsequence from dp matrix
Result:
Longest Common Subsequence: "ADH" with length 3
Annotated Code
DSA C
#include <stdio.h>
#include <string.h>

// Function to find LCS length and print the subsequence
void longestCommonSubsequence(char* X, char* Y) {
    int m = strlen(X);
    int n = strlen(Y);
    int dp[m+1][n+1];

    // Build dp matrix
    for (int i = 0; i <= m; i++) {
        for (int j = 0; j <= n; j++) {
            if (i == 0 || j == 0) {
                dp[i][j] = 0; // base case: empty string
            } else if (X[i-1] == Y[j-1]) {
                dp[i][j] = dp[i-1][j-1] + 1; // match found
            } else {
                dp[i][j] = (dp[i-1][j] > dp[i][j-1]) ? dp[i-1][j] : dp[i][j-1]; // max of ignoring one char
            }
        }
    }

    // Length of LCS
    int index = dp[m][n];
    char lcs[index+1];
    lcs[index] = '\0';

    // Backtrack to find LCS string
    int i = m, j = n;
    while (i > 0 && j > 0) {
        if (X[i-1] == Y[j-1]) {
            lcs[index-1] = X[i-1];
            i--;
            j--;
            index--;
        } else if (dp[i-1][j] > dp[i][j-1]) {
            i--;
        } else {
            j--;
        }
    }

    printf("Longest Common Subsequence: %s\n", lcs);
}

int main() {
    char X[] = "ABCDGH";
    char Y[] = "AEDFHR";
    longestCommonSubsequence(X, Y);
    return 0;
}
if (i == 0 || j == 0) { dp[i][j] = 0; // base case: empty string }
initialize dp edges with zero for empty prefixes
else if (X[i-1] == Y[j-1]) { dp[i][j] = dp[i-1][j-1] + 1; // match found }
increase subsequence length when characters match
else { dp[i][j] = (dp[i-1][j] > dp[i][j-1]) ? dp[i-1][j] : dp[i][j-1]; // max of ignoring one char }
choose max subsequence length ignoring one character
while (i > 0 && j > 0) { if (X[i-1] == Y[j-1]) { lcs[index-1] = X[i-1]; i--; j--; index--; } else if (dp[i-1][j] > dp[i][j-1]) { i--; } else { j--; } }
backtrack dp matrix to reconstruct LCS string
OutputSuccess
Longest Common Subsequence: ADH
Complexity Analysis
Time: O(m*n) because we fill a matrix of size m by n once
Space: O(m*n) because we store subsequence lengths for all substring pairs
vs Alternative: Naive recursion without memoization is exponential O(2^(m+n)), so DP is much faster
Edge Cases
One or both strings empty
LCS length is zero and empty subsequence returned
DSA C
if (i == 0 || j == 0) {
    dp[i][j] = 0; // base case: empty string
}
Strings with no common characters
LCS length is zero and empty subsequence returned
DSA C
else {
    dp[i][j] = (dp[i-1][j] > dp[i][j-1]) ? dp[i-1][j] : dp[i][j-1]; // max of ignoring one char
}
Strings with all characters matching
LCS is the entire string
DSA C
else if (X[i-1] == Y[j-1]) {
    dp[i][j] = dp[i-1][j-1] + 1; // match found
}
When to Use This Pattern
When you need to find the longest sequence common to two sequences in order, use the Longest Common Subsequence pattern because it efficiently finds the maximum matching order.
Common Mistakes
Mistake: Not initializing the first row and column of dp matrix to zero
Fix: Explicitly set dp[i][0] and dp[0][j] to zero to handle empty substring cases
Mistake: Incorrectly indexing strings with dp indices (off by one)
Fix: Use X[i-1] and Y[j-1] when accessing characters because dp indices start from 1
Mistake: Not backtracking dp matrix to reconstruct the subsequence
Fix: Implement backtracking loop to build the LCS string from dp matrix
Summary
Finds the longest sequence of characters common to two strings in order.
Use when you want to compare two sequences and find their maximum ordered overlap.
The key insight is building a matrix of subsequence lengths and backtracking to get the sequence.