Practice - 5 Tasks
Answer the questions below.
Task 1: Fill in the blank (easy)
Complete the code to load a pre-trained translation model from Hugging Face.
Topic: NLP

    from transformers import MarianMTModel, MarianTokenizer
    model_name = 'Helsinki-NLP/opus-mt-en-de'
    tokenizer = MarianTokenizer.from_pretrained([1])
Common Mistakes:
- Using a language model like 'bert-base-uncased' instead of a translation model.
- Confusing speech models with translation models.
Explanation: We use the exact model name 'Helsinki-NLP/opus-mt-en-de' to load the tokenizer for English-to-German translation.
Task 2: Fill in the blank (medium)
Complete the code to tokenize the input text for translation.
Topic: NLP

    text = 'Hello, how are you?'
    tokenized = tokenizer([1], return_tensors='pt', padding=True)
Common Mistakes:
- Passing a list of words instead of a string.
- Passing the tokenizer object instead of the text.
Explanation: We pass the variable 'text' containing the input string to the tokenizer.
Task 3: Fill in the blank (hard)
Fix the error in generating the translated tokens.
Topic: NLP

    translated_tokens = model.generate([1].input_ids, max_length=40)
Common Mistakes:
- Passing raw text instead of token IDs.
- Passing the tokenizer or model object instead of token IDs.
Explanation: We use 'tokenized.input_ids' as input to the model's generate function to produce translated tokens.
Task 4: Fill in the blanks (hard)
Fill both blanks to decode the translated tokens into readable text.
Topic: NLP

    translated_text = tokenizer.[1](translated_tokens[0], skip_special_tokens=[2])
Common Mistakes:
- Using 'encode' instead of 'decode'.
- Not skipping special tokens, resulting in unwanted symbols.
Explanation: We use 'decode' to convert tokens back to text and set skip_special_tokens=True to strip special tokens from the output.
Task 5: Fill in the blanks (hard)
Fill all three blanks to create a function that translates English text to German.
Topic: NLP

    def translate_en_to_de(text):
        inputs = tokenizer(text, return_tensors=[1], padding=True)
        outputs = model.generate(inputs.[2], max_length=50)
        return tokenizer.[3](outputs[0], skip_special_tokens=True)
Common Mistakes:
- Using 'text' instead of 'input_ids' for model input.
- Returning tokens without decoding.
- Using the wrong tensor type, like 'tf' instead of 'pt'.
Explanation: We use 'pt' for PyTorch tensors, 'input_ids' for model input, and 'decode' to get the translated text.