Challenge - 5 Problems
Edit Distance Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
intermediate1:30remaining
Output of Levenshtein distance function
What is the output of this Python function call?
```python def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = range(len(s2) + 1) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] print(levenshtein("kitten", "sitting")) ```
```python def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = range(len(s2) + 1) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] print(levenshtein("kitten", "sitting")) ```
NLP
def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = range(len(s2) + 1) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] print(levenshtein("kitten", "sitting"))
Attempts:
2 left
💡 Hint
Count the minimum edits needed to change 'kitten' into 'sitting'.
✗ Incorrect
The Levenshtein distance between 'kitten' and 'sitting' is 3: substitute 'k'->'s', substitute 'e'->'i', insert 'g' at the end.
🧠 Conceptual
intermediate1:00remaining
Understanding Levenshtein distance properties
Which of the following statements about Levenshtein distance is TRUE?
Attempts:
2 left
💡 Hint
Think about what the distance measures and if it can be negative.
✗ Incorrect
Levenshtein distance counts the minimum number of insertions, deletions, and substitutions to transform one string into another, so it is always zero or positive.
❓ Hyperparameter
advanced1:30remaining
Choosing weights in weighted Levenshtein distance
In a weighted Levenshtein distance, you assign different costs to insertions, deletions, and substitutions. If you want to penalize substitutions more than insertions and deletions, which weight setting is correct?
Attempts:
2 left
💡 Hint
Higher cost means more penalty.
✗ Incorrect
To penalize substitutions more, assign a higher cost to substitutions than insertions or deletions.
🔧 Debug
advanced1:30remaining
Identify the error in Levenshtein distance code
What error will this code raise when called with levenshtein('a', 'b')?
```python def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = [0] * len(s2) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] ```
```python def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = [0] * len(s2) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] ```
NLP
def levenshtein(s1, s2): if len(s1) < len(s2): return levenshtein(s2, s1) if len(s2) == 0: return len(s1) previous_row = [0] * len(s2) for i, c1 in enumerate(s1): current_row = [i + 1] for j, c2 in enumerate(s2): insertions = previous_row[j + 1] + 1 deletions = current_row[j] + 1 substitutions = previous_row[j] + (c1 != c2) current_row.append(min(insertions, deletions, substitutions)) previous_row = current_row return previous_row[-1] levenshtein('a', 'b')
Attempts:
2 left
💡 Hint
Check the length of previous_row and how indexes are accessed.
✗ Incorrect
previous_row is initialized with length len(s2), but code accesses previous_row[j + 1], which causes IndexError when j is last index.
❓ Model Choice
expert2:00remaining
Best model for approximate string matching using edit distance
You want to build a system that suggests corrections for misspelled words by finding dictionary words with smallest edit distance. Which model or approach is best suited for this task?
Attempts:
2 left
💡 Hint
Think about efficient search for words within a certain edit distance.
✗ Incorrect
BK-trees are designed for efficient approximate matching using edit distance metrics, making them ideal for spell correction tasks.