Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
Easy: Complete the code to replace unknown words with a special token.
NLP
def replace_oov(word, vocab):
    if word not in vocab:
        return [1]
    return word
Common Mistakes
Using the padding token '<PAD>' instead of the unknown token.
Using the end-of-sequence token '<EOS>' where the unknown token belongs.
Explanation: The special token '<UNK>' is commonly used to represent out-of-vocabulary words.
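A minimal sketch of the completed function, assuming the unknown token is spelled '<UNK>' (some toolkits use other spellings). The toy vocabulary is hypothetical and only serves the illustration.

```python
UNK_TOKEN = "<UNK>"  # assumed spelling of the unknown token

def replace_oov(word, vocab):
    """Return the word unchanged if it is known, otherwise the unknown token."""
    if word not in vocab:
        return UNK_TOKEN
    return word

vocab = {"the", "cat", "sat"}  # hypothetical toy vocabulary
print(replace_oov("cat", vocab))    # cat
print(replace_oov("zebra", vocab))  # <UNK>
```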
2. Fill in the blank
Medium: Complete the code to convert words to indices, using the unknown token index for out-of-vocabulary words.
NLP
def word_to_index(word, word_index, unk_index):
    return word_index.get(word, [1])
Common Mistakes
Returning 0, which may be the padding index.
Returning None, which causes errors when the result is used as model input.
Explanation: If the word is not found, the function returns 'unk_index', the index of the unknown token.
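The lookup can be sketched as follows. The index layout (0 for padding, 1 for unknown words) is an assumption made for this example, not something the task prescribes.

```python
# Hypothetical index layout: 0 is reserved for padding, 1 for unknown words.
word_index = {"<PAD>": 0, "<UNK>": 1, "the": 2, "cat": 3}
unk_index = word_index["<UNK>"]

def word_to_index(word, word_index, unk_index):
    # dict.get returns the second argument when the key is missing
    return word_index.get(word, unk_index)

known = word_to_index("cat", word_index, unk_index)
unknown = word_to_index("zebra", word_index, unk_index)
print(known, unknown)  # 3 1
```

With this layout, a default of 0 would make unknown words indistinguishable from padding, which is why the unknown index is passed explicitly.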
3. Fill in the blank
Hard: Fix the code that handles out-of-vocabulary words by filling in the blank.
NLP
def preprocess_sentence(sentence, vocab, unk_token):
    return [word if word in vocab else [1] for word in sentence]
Common Mistakes
Returning the original word even when it is not in vocab.
Using the padding token instead of the unknown token.
Explanation: Words that are not in vocab are replaced with the value of the 'unk_token' parameter.
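As an illustration, here is the completed list comprehension applied to a tokenized sentence. It assumes 'sentence' is already a list of words; the whitespace split and the vocabulary below are hypothetical.

```python
def preprocess_sentence(sentence, vocab, unk_token):
    # Keep known words, substitute the unknown token for everything else
    return [word if word in vocab else unk_token for word in sentence]

tokens = "the cat chased the zebra".split()  # naive whitespace tokenization
vocab = {"the", "cat", "sat"}                # hypothetical toy vocabulary
cleaned = preprocess_sentence(tokens, vocab, "<UNK>")
print(cleaned)  # ['the', 'cat', '<UNK>', 'the', '<UNK>']
```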
4. Fill in the blank
Hard: Fill both blanks to create a dictionary mapping words to indices, assigning the unknown token index for out-of-vocabulary words.
NLP
def create_index(sentence, vocab, unk_index):
    return {word: vocab.get(word, [1]) for word in sentence if word [2] vocab}
Common Mistakes
Using 'in' instead of 'not in' in the condition.
Using the wrong index for unknown words.
Explanation: The condition 'not in' keeps only the words missing from vocab, so vocab.get falls back to 'unk_index' for each of them.
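A short sketch of the filled-in comprehension, assuming 'vocab' is a dict from words to indices (the sample entries are hypothetical):

```python
def create_index(sentence, vocab, unk_index):
    # The filter keeps only out-of-vocabulary words, so the .get default always applies
    return {word: vocab.get(word, unk_index) for word in sentence if word not in vocab}

vocab = {"<UNK>": 1, "the": 2, "cat": 3}  # hypothetical word-to-index mapping
oov_map = create_index(["the", "zebra", "cat", "yak"], vocab, vocab["<UNK>"])
print(oov_map)  # {'zebra': 1, 'yak': 1}
```

Because of the filter, the result contains only the out-of-vocabulary words. Removing the condition would instead map every word in the sentence, known or not.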
5. Fill in the blank
Hard: Fill all three blanks to build a list of indices for a sentence, replacing out-of-vocabulary words with the unknown token index.
NLP
def sentence_to_indices(sentence, vocab, unk_index):
    return [vocab.get([1], [2]) for [3] in sentence]
Common Mistakes
Using different names for the loop variable and the lookup key.
Not providing the default index for missing words.
Explanation: The comprehension iterates over each 'word' in sentence and looks up its index, falling back to 'unk_index' when the word is missing.
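Put together, the completed function might look like the sketch below. The vocabulary and its index assignments are assumptions chosen for the example.

```python
def sentence_to_indices(sentence, vocab, unk_index):
    # The loop variable and the lookup key must be the same name
    return [vocab.get(word, unk_index) for word in sentence]

# Hypothetical vocabulary with reserved padding and unknown entries
vocab = {"<PAD>": 0, "<UNK>": 1, "the": 2, "cat": 3, "sat": 4}
indices = sentence_to_indices(["the", "cat", "sat", "quietly"], vocab, vocab["<UNK>"])
print(indices)  # [2, 3, 4, 1]
```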