Practice

(1/5)

1. What does the edit distance (Levenshtein distance) between two words measure?

easy

A. The length difference between two words

B. The minimum number of single-character edits to change one word into the other

C. The number of common letters between two words

D. The number of vowels in both words combined

Solution

Step 1: Understand the definition of edit distance
Edit distance counts how many changes like insertions, deletions, or substitutions are needed to convert one word into another.
Step 2: Compare options with the definition
Only The minimum number of single-character edits to change one word into the other correctly describes this minimum number of single-character edits.
Final Answer:
The minimum number of single-character edits to change one word into the other -> Option B
Quick Check:
Edit distance = minimum edits [OK]

Hint: Edit distance = smallest edits to match words [OK]

Common Mistakes:

Confusing edit distance with common letters count
Thinking it measures length difference only
Mixing up vowels or letter counts

2. Which of the following Python code snippets correctly initializes a 2D table for computing edit distance between strings s1 and s2 of lengths m and n?

easy

A. table = [[0] * (n + 1) for _ in range(m + 1)]

B. table = [[0] * m for _ in range(n)]

C. table = [[0] * (m + 1) for _ in range(n + 1)]

D. table = [[0] * n for _ in range(m)]

Solution

Step 1: Recall table size for edit distance
The table size must be (m+1) rows and (n+1) columns to include empty prefixes of both strings.
Step 2: Match code to correct dimensions
table = [[0] * (n + 1) for _ in range(m + 1)] creates a list with (m+1) rows and each row has (n+1) zeros, which is correct.
Final Answer:
table = [[0] * (n + 1) for _ in range(m + 1)] -> Option A
Quick Check:
Table size = (m+1) x (n+1) [OK]

Hint: Table size = (len(s1)+1) x (len(s2)+1) [OK]

Common Mistakes:

Swapping m and n in dimensions
Forgetting to add 1 for empty string prefix
Using wrong list comprehension order

3. What is the edit distance between the words "kitten" and "sitting"?

medium

A. 2

B. 4

C. 3

D. 5

Solution

Step 1: Identify edits from "kitten" to "sitting"
Changes: substitute 'k' -> 's', substitute 'e' -> 'i', insert 'g' at the end.
Step 2: Count total edits
There are 3 edits: 2 substitutions and 1 insertion.
Final Answer:
3 -> Option C
Quick Check:
Edits counted = 3 [OK]

Hint: Count substitutions + insertions + deletions [OK]

Common Mistakes:

Counting only substitutions
Missing the final insertion
Confusing similar letters as no change

4. Consider this Python code snippet for edit distance calculation. What is the error?

def edit_distance(s1, s2):
    m, n = len(s1), len(s2)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        table[i][0] = i
    for j in range(n + 1):
        table[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i] == s2[j]:
                cost = 0
            else:
                cost = 1
            table[i][j] = min(table[i-1][j] + 1, table[i][j-1] + 1, table[i-1][j-1] + cost)
    return table[m][n]

medium

A. The return statement should be table[0][0]

B. The table initialization is incorrect

C. The cost should be 2 when characters differ

D. Indexing s1[i] and s2[j] causes an off-by-one error

Solution

Step 1: Check string indexing in loops
Strings are 0-indexed, but loops start at 1, so s1[i] and s2[j] skip first character and cause index errors.
Step 2: Correct indexing
Use s1[i-1] and s2[j-1] to access correct characters for comparison.
Final Answer:
Indexing s1[i] and s2[j] causes an off-by-one error -> Option D
Quick Check:
Indexing error = s1[i-1], s2[j-1] needed [OK]

Hint: Remember strings start at index 0, loops at 1 need offset [OK]

Common Mistakes:

Using s1[i] instead of s1[i-1]
Thinking cost must be 2 for difference
Returning wrong table cell

5. You want to find the closest word to "flame" from the list ["frame", "flan", "flame", "blame"] using edit distance. Which word will the algorithm select as closest?

hard

A. "flame"

B. "frame"

C. "blame"

D. "flan"

Solution

Step 1: Calculate edit distances to each word
Distances: "flame" to "frame" = 1 (substitute 'l'->'r'), to "flan" = 2 (delete 'm' and 'e'), to "blame" = 1 (substitute 'f'->'b'), to "flame" = 0 (same word).
Step 2: Identify minimum distance
The smallest distance is 0 for "flame" itself, meaning exact match.
Final Answer:
"flame" -> Option A
Quick Check:
Exact match distance = 0 [OK]

Hint: Exact match has zero edit distance [OK]

Common Mistakes:

Choosing word with one letter difference over exact match
Ignoring exact matches
Miscounting substitutions

Why Edit distance (Levenshtein) in NLP? - Purpose & Use Cases

Start learning this pattern below

Practice

Solution

Step 1: Understand the definition of edit distance

Step 2: Compare options with the definition

Final Answer:

Quick Check:

Solution

Step 1: Recall table size for edit distance

Step 2: Match code to correct dimensions

Final Answer:

Quick Check:

Solution

Step 1: Identify edits from "kitten" to "sitting"

Step 2: Count total edits

Final Answer:

Quick Check:

Solution

Step 1: Check string indexing in loops

Step 2: Correct indexing

Final Answer:

Quick Check:

Solution

Step 1: Calculate edit distances to each word

Step 2: Identify minimum distance

Final Answer:

Quick Check: