Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)
Complete the code to load the T5 tokenizer.
NLP
from transformers import T5Tokenizer
tokenizer = T5Tokenizer.from_pretrained('[1]')
Common Mistakes
Using model names from other architectures, such as 'bert-base-uncased', which causes errors.
Misspelling the model name.
Explanation: The T5 tokenizer is loaded with the model name 't5-small'; the other options belong to different models.
2. Fill in the blank (medium)
Complete the code to prepare input text for the T5 model.
NLP
input_text = "translate English to German: The house is wonderful."
inputs = tokenizer([1], return_tensors='pt')
Common Mistakes
Passing a raw string instead of the variable.
Using the output text instead of the input text.
Explanation: We pass the variable input_text (not a quoted string) to the tokenizer to prepare inputs for the model.
3. Fill in the blank (hard)
Fix the error in generating output tokens from the model.
NLP
outputs = model.generate([1].input_ids)
Common Mistakes
Passing raw text instead of token IDs.
Using the tokenizer object instead of inputs.
Explanation: model.generate expects the input IDs from the tokenized output, which are stored in inputs.
4. Fill in the blank (hard)
Fill both blanks to decode the output tokens into text.
NLP
decoded_output = tokenizer.[1](outputs[0], skip_special_tokens=[2])
Common Mistakes
Using 'encode' instead of 'decode'.
Setting skip_special_tokens to False, which leaves special tokens in the output.
Explanation: We use decode to convert token IDs back into text and set skip_special_tokens=True to remove special tokens such as the end-of-sequence marker.
5. Fill in the blank (hard)
Fill all three blanks to complete the T5 translation pipeline.
NLP
from transformers import T5ForConditionalGeneration, T5Tokenizer
tokenizer = T5Tokenizer.from_pretrained('[1]')
model = T5ForConditionalGeneration.from_pretrained('[2]')
input_text = "translate English to French: I love machine learning."
inputs = tokenizer(input_text, return_tensors='pt')
outputs = model.generate([3].input_ids)
translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
Common Mistakes
Mixing model names like 'bert-base-uncased' or 'gpt2' with T5 code.
Passing raw text instead of token IDs to model.generate.
Explanation: Both the tokenizer and the model are loaded with 't5-small'; the tokenized inputs stored in inputs provide the input IDs for generation.