NLP · ~10 mins

RoBERTa and DistilBERT in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
Task 1: fill in the blank (easy)

Complete the code to load the RoBERTa tokenizer.

from transformers import [1]
tokenizer = [1].from_pretrained('roberta-base')
Options:
A. BertTokenizer
B. DistilBertTokenizer
C. RobertaTokenizer
D. GPT2Tokenizer
Common Mistakes
Using BertTokenizer instead of RobertaTokenizer
Using DistilBertTokenizer for RoBERTa
Task 2: fill in the blank (medium)

Complete the code to load the DistilBERT model for sequence classification.

from transformers import [1]
model = [1].from_pretrained('distilbert-base-uncased-finetuned-sst-2-english')
Options:
A. BertForSequenceClassification
B. DistilBertForSequenceClassification
C. GPT2ForSequenceClassification
D. RobertaForSequenceClassification
Common Mistakes
Using RobertaForSequenceClassification for DistilBERT
Using GPT2ForSequenceClassification which is unrelated
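The completed snippet with a quick check of the label count; a sketch assuming `transformers` and PyTorch are installed (the checkpoint is downloaded on first use):

```python
from transformers import DistilBertForSequenceClassification

# This checkpoint was fine-tuned on SST-2, a binary sentiment task,
# so its classification head has two labels.
model = DistilBertForSequenceClassification.from_pretrained(
    'distilbert-base-uncased-finetuned-sst-2-english'
)
print(model.config.num_labels)
```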
Task 3: fill in the blank (hard)

Fix the error in the code to tokenize input text with the RoBERTa tokenizer.

inputs = tokenizer([1], return_tensors='pt')
Options:
A. 'Hello, how are you?'
B. 12345
C. b'Hello, how are you?'
D. ['Hello, how are you?']
Common Mistakes
Passing a list instead of a string
Passing bytes instead of string
Passing an integer
Task 4: fill in the blank (hard)

Fill both blanks to create a dictionary comprehension that maps tokens to their IDs using the DistilBERT tokenizer.

token_ids = {token: [1] for token, [2] in tokenizer.get_vocab().items()}
Options:
A. id
B. token_id
C. id_
D. tokenid
Common Mistakes
Using different variable names for the token ID
Using undefined variable names
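The comprehension works with any name for the value variable as long as both blanks use the same one. A self-contained sketch with a tiny stand-in vocabulary (hypothetical entries; the real dict comes from `tokenizer.get_vocab()`):

```python
# Stand-in for tokenizer.get_vocab(), which maps token strings to integer IDs.
vocab = {'hello': 7592, '##world': 11108}  # hypothetical entries

# Both blanks must name the same variable; `id_` avoids shadowing
# the built-in id().
token_ids = {token: id_ for token, id_ in vocab.items()}
print(token_ids)
```

Since `get_vocab()` already returns a token-to-ID dict, the comprehension simply copies it; the exercise mainly tests consistent loop-variable naming.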
Task 5: fill in the blank (hard)

Fill all three blanks to prepare inputs, run the model, and get logits using RoBERTa.

inputs = tokenizer([1], return_tensors=[2])
outputs = model([3])
logits = outputs.logits
Options:
A. 'This is a test sentence.'
B. 'pt'
C. inputs['input_ids']
D. 'tf'
Common Mistakes
Passing the whole inputs dict instead of input IDs
Using TensorFlow tensor type 'tf' when model expects PyTorch
Passing a list instead of a string to tokenizer