Practice - 5 Tasks
Answer the questions below
1. Fill in the blank (easy, NLP)
Complete the code to load the BERT tokenizer.

from transformers import [1]
tokenizer = [1].from_pretrained('bert-base-uncased')
Common Mistakes
- Using BertModel instead of a tokenizer class.
- Forgetting to import the tokenizer class.
- Using BertTokenizer, which is less flexible than AutoTokenizer.
Explanation: We use AutoTokenizer to load the BERT tokenizer by specifying the model name; it picks the matching tokenizer class automatically.
2. Fill in the blank (medium, NLP)
Complete the code to tokenize input texts with padding and truncation.

inputs = tokenizer(texts, padding=[1], truncation=[1], return_tensors='pt')
Common Mistakes
- Setting padding or truncation to False, which causes errors when inputs have varying lengths.
- Using the string 'max_length' where the boolean True is expected here.
Explanation: Setting padding=True and truncation=True pads shorter inputs and truncates longer ones, so every sequence in the batch has the same length for the model.
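To see what the padding step produces, here is a minimal sketch using plain PyTorch (torch.nn.utils.rnn.pad_sequence) rather than the tokenizer itself; the token ids are made up for illustration:

```python
import torch
from torch.nn.utils.rnn import pad_sequence

# Two "tokenized" sequences of different lengths (ids are illustrative)
seqs = [torch.tensor([101, 7592, 2088, 102]),
        torch.tensor([101, 102])]

# padding=True in the tokenizer does essentially this: pad every sequence
# to the length of the longest one so they stack into a single tensor
batch = pad_sequence(seqs, batch_first=True, padding_value=0)
print(batch.shape)  # torch.Size([2, 4])
print(batch[1])     # tensor([101, 102, 0, 0])
```

With truncation=True the tokenizer additionally cuts sequences that exceed the model's maximum length before this padding happens.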
3. Fill in the blank (hard, NLP)
Fix the error in the model definition by completing the missing class name.

from transformers import [1]
model = [1].from_pretrained('bert-base-uncased', num_labels=2)
Common Mistakes
- Using BertModel, which lacks a classification head.
- Using AutoTokenizer instead of a model class.
Explanation: BertForSequenceClassification is the correct class for fine-tuning BERT on classification tasks; num_labels=2 gives it a two-class classification head.
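The classification head can be inspected without downloading any weights by building the model from a config. This is a minimal sketch with a hypothetical tiny configuration (the small sizes are arbitrary, chosen only to keep it fast), not the real pretrained model:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

# Tiny random-weight config so nothing is downloaded; in real fine-tuning
# you would use
# BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64, num_labels=2)
model = BertForSequenceClassification(config)

input_ids = torch.tensor([[2, 5, 7, 3]])  # one toy sequence
logits = model(input_ids).logits          # classification head output
print(logits.shape)                       # torch.Size([1, 2]): one score per class
```

A plain BertModel would return hidden states here, not per-class logits, which is why it is the wrong choice for this task.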
4. Fill in the blank (hard, NLP)
Fill both blanks to prepare the optimizer and learning-rate scheduler.

from transformers import [1], [2]
optimizer = [1](model.parameters(), lr=2e-5)
scheduler = [2](optimizer, num_warmup_steps=0, num_training_steps=100)
Common Mistakes
- Using the SGD optimizer, which is less effective for BERT fine-tuning.
- Using the StepLR scheduler, which is not typical for transformers.
Explanation: AdamW is the recommended optimizer for BERT fine-tuning, and get_linear_schedule_with_warmup supplies the standard linear warmup-then-decay learning-rate schedule.
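The shape of that schedule can be sketched in plain PyTorch with LambdaLR, mirroring what transformers' get_linear_schedule_with_warmup computes; the warmup and step counts below are illustrative, and a toy linear layer stands in for the BERT model:

```python
import torch
from torch.optim.lr_scheduler import LambdaLR

model = torch.nn.Linear(4, 2)  # stand-in for the BERT model
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

num_warmup_steps, num_training_steps = 10, 100

def linear_warmup(step):
    # Linear ramp from 0 to 1 over the warmup steps, then linear decay to 0
    if step < num_warmup_steps:
        return step / max(1, num_warmup_steps)
    return max(0.0, (num_training_steps - step) /
               max(1, num_training_steps - num_warmup_steps))

scheduler = LambdaLR(optimizer, linear_warmup)

for _ in range(num_warmup_steps):  # after warmup the lr reaches its peak
    optimizer.step()
    scheduler.step()
print(optimizer.param_groups[0]['lr'])  # 2e-05
```

The warmup phase avoids large early updates on the pretrained weights; the linear decay then shrinks the learning rate toward zero over the remaining training steps.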
5. Fill in the blank (hard, NLP)
Fill all three blanks to compute accuracy during evaluation.

from sklearn.metrics import [1]
predictions = outputs.logits.argmax(dim=[2]).cpu().numpy()
labels = batch['labels'].cpu().numpy()
acc = [3](labels, predictions)
Common Mistakes
- Using f1_score without importing it, or where plain accuracy is asked for.
- Using argmax over dim 0, which reduces along the wrong (batch) axis.
Explanation: accuracy_score computes the accuracy; argmax is taken over dimension 1 (the class dimension) to turn logits into class predictions.
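The full computation can be run with made-up logits and labels standing in for outputs.logits and batch['labels']:

```python
import numpy as np
import torch
from sklearn.metrics import accuracy_score

# Illustrative stand-ins for one evaluation batch
logits = torch.tensor([[2.0, -1.0],
                       [0.5,  1.5],
                       [3.0,  0.1]])
labels = np.array([0, 1, 1])

# argmax over dim 1 picks the highest-scoring class per row -> [0, 1, 0]
predictions = logits.argmax(dim=1).cpu().numpy()
acc = accuracy_score(labels, predictions)  # 2 of 3 correct
print(acc)
```

Using dim=0 instead would take the argmax down each column (across the batch), producing one index per class rather than one prediction per example.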