
BERT pre-training concept in NLP - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy)

Complete the code to import the BERT tokenizer from the transformers library.

from transformers import [1]
A. BertModel
B. BertTokenizer
C. AutoModel
D. GPT2Tokenizer
Common Mistakes
Importing the model class instead of the tokenizer.
Using a tokenizer from a different model like GPT-2.
2. Fill in the blank (medium)

Complete the code to build the masked language modeling label tensor: positions to predict keep their token IDs, and every other position is set to -100.

labels = input_ids.clone()
labels[~masked_indices] = [1]
A. input_ids
B. None
C. 0
D. -100
Common Mistakes
Setting unmasked tokens to 0, which is a valid token ID.
Leaving unmasked tokens unchanged, so the loss is computed on every position.
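The labelling step in this task can be sketched on a toy batch. This is a minimal illustration, not BERT's full masking procedure: the token IDs and the choice of masked positions are made up, and the `MASK_TOKEN_ID` value of 103 assumes the standard `bert-base-uncased` vocabulary.

```python
import torch

# Toy batch of token IDs (values chosen for illustration only).
input_ids = torch.tensor([[101, 2023, 2003, 1037, 7099, 102]])

# Hypothetical positions to predict; real pre-training samples them at random.
masked_indices = torch.tensor([[False, False, True, False, True, False]])

# Labels keep the original ID at masked positions and -100 everywhere else.
# -100 is the default ignore_index of torch.nn.CrossEntropyLoss.
labels = input_ids.clone()
labels[~masked_indices] = -100

# Assumption: 103 is the [MASK] ID in the bert-base-uncased vocabulary.
MASK_TOKEN_ID = 103
masked_input_ids = input_ids.clone()
masked_input_ids[masked_indices] = MASK_TOKEN_ID

print(labels)
print(masked_input_ids)
```

Only the two selected positions contribute to the loss, because every other label is -100.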
3. Fill in the blank (hard)

Complete the code that creates the attention mask for BERT input tokens, so that padding positions get 0 and real tokens get 1.

attention_mask = (input_ids != [1]).long()
A. 0
B. 1
C. -100
D. None
Common Mistakes
Using 1 as the padding token ID.
Using None, which causes an error.
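A small padded batch shows what the comparison produces. This is a sketch under the assumption that the padding ID is 0, as in the standard BERT vocabulary; `PAD_TOKEN_ID` is a name introduced here for illustration, and the other token IDs are arbitrary.

```python
import torch

# Assumption: [PAD] has ID 0, as in the standard BERT vocabulary.
PAD_TOKEN_ID = 0

# Two sequences padded to the same length (toy IDs).
input_ids = torch.tensor([
    [101, 7592, 2088, 102, 0, 0],
    [101, 7592, 102, 0, 0, 0],
])

# 1 for real tokens, 0 for padding positions.
attention_mask = (input_ids != PAD_TOKEN_ID).long()

print(attention_mask)
```

In practice the tokenizer can return this mask directly; building it by hand is mainly useful for seeing what it encodes.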
4. Fill in the blanks (hard)

Fill both blanks to create a dictionary for masked language modeling inputs and labels.

inputs = {
    'input_ids': [1],
    'labels': [2]
}
A. input_ids
B. labels
C. attention_mask
D. token_type_ids
Common Mistakes
Mistaking attention_mask or token_type_ids for labels.
Swapping input_ids and labels.
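To see how such a dictionary is consumed, the sketch below passes it to a tiny, randomly initialized `BertForMaskedLM`. The configuration sizes and token IDs are illustrative assumptions chosen so that nothing needs to be downloaded; they are not BERT's real settings, and the resulting loss value is meaningless beyond showing the call pattern.

```python
import torch
from transformers import BertConfig, BertForMaskedLM

# Tiny random model; all sizes are assumptions made for a quick local run.
config = BertConfig(
    vocab_size=200,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertForMaskedLM(config)
model.eval()

# Toy inputs: position 1 is the one to predict, the rest are ignored (-100).
input_ids = torch.tensor([[101, 103, 50, 102]])
labels = torch.tensor([[-100, 42, -100, -100]])

inputs = {
    "input_ids": input_ids,
    "labels": labels,
}

# Unpacking the dictionary passes each entry as a keyword argument.
with torch.no_grad():
    outputs = model(**inputs)

print(outputs.loss)          # scalar loss over the single labelled position
print(outputs.logits.shape)  # (batch, sequence length, vocabulary size)
```

When `labels` is supplied, the model computes the masked language modeling loss itself and returns it as `outputs.loss`.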
5. Fill in the blank (hard)

Fill in the blank to compute the masked language modeling loss using the model outputs and labels.

from torch.nn import CrossEntropyLoss

loss_fct = CrossEntropyLoss()
logits = outputs.logits
masked_lm_loss = loss_fct(logits.view(-1, logits.size([1])), labels.view(-1))
A. 2
B. 0
C. 3
D. 1
Common Mistakes
Using the wrong dimension index, which causes a shape mismatch.
Not reshaping logits and labels properly.
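The reshaping in this task can be checked with random logits in place of real model outputs. This is a sketch with made-up sizes: the logits are assumed to have shape (batch, sequence length, vocabulary size), so the vocabulary dimension is index 2, and the manual computation at the end is only there to confirm that positions labelled -100 are ignored.

```python
import torch
from torch.nn import CrossEntropyLoss

torch.manual_seed(0)

# Toy sizes; random logits stand in for outputs.logits.
batch_size, seq_len, vocab_size = 2, 4, 10
logits = torch.randn(batch_size, seq_len, vocab_size)

# Only two positions carry a label; the rest are ignored.
labels = torch.full((batch_size, seq_len), -100, dtype=torch.long)
labels[0, 1] = 3
labels[1, 2] = 7

loss_fct = CrossEntropyLoss()  # ignore_index defaults to -100

# Flatten to (batch * seq_len, vocab_size) and (batch * seq_len,).
flat_logits = logits.view(-1, logits.size(2))
flat_labels = labels.view(-1)
masked_lm_loss = loss_fct(flat_logits, flat_labels)

# Manual check: mean negative log-probability over the two labelled positions.
log_probs = torch.log_softmax(logits, dim=-1)
manual_loss = -(log_probs[0, 1, 3] + log_probs[1, 2, 7]) / 2

print(masked_lm_loss, manual_loss)
```

Using `logits.size(-1)` would select the same dimension and avoids hard-coding the number of axes.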