Python Program to Find Edit Distance Between Strings
insert, delete, and replace operations; for example, use the function def edit_distance(s1, s2): with a nested loop to compute the distance.Examples
How to Think About It
Algorithm
Code
def edit_distance(s1, s2): m, n = len(s1), len(s2) dp = [[0] * (n + 1) for _ in range(m + 1)] for i in range(m + 1): dp[i][0] = i for j in range(n + 1): dp[0][j] = j for i in range(1, m + 1): for j in range(1, n + 1): if s1[i - 1] == s2[j - 1]: dp[i][j] = dp[i - 1][j - 1] else: dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1]) return dp[m][n] print(edit_distance("kitten", "sitting"))
Dry Run
Let's trace the example 'kitten' and 'sitting' through the code
Initialize dp table
Create a 7x8 table (lengths + 1) with first row 0 to 7 and first column 0 to 6
Fill dp table
Compare characters one by one, fill dp[i][j] with minimum edits considering insert, delete, replace
Final result
dp[6][7] = 3 means minimum 3 edits to convert 'kitten' to 'sitting'
| i | j | s1[i-1] | s2[j-1] | dp[i][j] |
|---|---|---|---|---|
| 1 | 1 | k | s | 1 |
| 2 | 2 | i | i | 1 |
| 3 | 3 | t | t | 1 |
| 4 | 4 | t | t | 1 |
| 5 | 5 | e | i | 2 |
| 6 | 6 | n | n | 2 |
| 6 | 7 | n | g | 3 |
Why This Works
Step 1: Setup the table
We create a table where each cell represents the minimum edits needed to convert substrings of the two strings up to that point.
Step 2: Compare characters
If characters are the same, no new edit is needed, so we take the diagonal value; if different, we consider insert, delete, or replace.
Step 3: Choose minimum edits
We pick the smallest number of edits from the three possible operations plus one to update the current cell.
Alternative Approaches
def edit_distance_recursive(s1, s2, i=0, j=0, memo=None): if memo is None: memo = {} if (i, j) in memo: return memo[(i, j)] if i == len(s1): return len(s2) - j if j == len(s2): return len(s1) - i if s1[i] == s2[j]: memo[(i, j)] = edit_distance_recursive(s1, s2, i+1, j+1, memo) else: insert = 1 + edit_distance_recursive(s1, s2, i, j+1, memo) delete = 1 + edit_distance_recursive(s1, s2, i+1, j, memo) replace = 1 + edit_distance_recursive(s1, s2, i+1, j+1, memo) memo[(i, j)] = min(insert, delete, replace) return memo[(i, j)] print(edit_distance_recursive("kitten", "sitting"))
import difflib def edit_distance_difflib(s1, s2): seq = difflib.SequenceMatcher(None, s1, s2) return int(round((1 - seq.ratio()) * max(len(s1), len(s2)))) print(edit_distance_difflib("kitten", "sitting"))
Complexity: O(m*n) time, O(m*n) space
Time Complexity
The algorithm fills a table of size m by n, where m and n are the lengths of the two strings, so it runs in O(m*n) time.
Space Complexity
The 2D table requires O(m*n) space to store intermediate results; space can be optimized but at cost of complexity.
Which Approach is Fastest?
Dynamic programming is fastest for exact results; recursive without memo is slow; difflib is faster but approximate.
| Approach | Time | Space | Best For |
|---|---|---|---|
| Dynamic Programming | O(m*n) | O(m*n) | Exact results, medium to large strings |
| Recursive with Memoization | O(m*n) | O(m*n) | Conceptual clarity, smaller inputs |
| difflib Approximation | O(m+n) | O(1) | Quick similarity checks, approximate results |