Complete the code to load the BERT tokenizer.
from transformers import [1]
tokenizer = [1].from_pretrained('bert-base-uncased')
The AutoTokenizer class loads the correct tokenizer for BERT automatically.
Complete the code to tokenize input text for BERT.
inputs = tokenizer('[1]', return_tensors='pt', padding=True, truncation=True)
Any sentence can be tokenized; the blank just needs a sample input, and 'This is a test sentence.' is a clear example.
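Putting the first two items together, a completed version of the snippet might look like the following (this assumes the transformers library is installed and the 'bert-base-uncased' files can be downloaded or are already cached):

```python
from transformers import AutoTokenizer

# AutoTokenizer picks the correct tokenizer class for the checkpoint.
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Tokenize a sample sentence into PyTorch tensors, with padding and truncation enabled.
inputs = tokenizer('This is a test sentence.', return_tensors='pt',
                   padding=True, truncation=True)

# inputs is a dict-like object with 'input_ids', 'attention_mask', etc.,
# each of shape (batch_size=1, sequence_length).
print(inputs['input_ids'].shape)
```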
Complete the code to get BERT's pooled output for classification.
outputs = model(**inputs)
pooled_output = outputs.[1]
The pooler_output is the [CLS] token's hidden state passed through BERT's pooler (a dense layer with tanh activation), commonly used as the sentence representation for classification.
Fill both blanks to define a simple classifier head on top of BERT's pooled output.
import torch.nn as nn

class BertClassifier(nn.Module):
    def __init__(self, bert_model):
        super().__init__()
        self.bert = bert_model
        self.classifier = nn.[1](bert_model.config.hidden_size, [2])

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = outputs.pooler_output
        return self.classifier(pooled_output)
The classifier is a Linear layer mapping from BERT's hidden size to the number of classes, here 2.
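To see the classifier head in action without downloading BERT weights, here is a runnable sketch. TinyEncoder is a hypothetical stand-in for BertModel, invented for this example: it only mimics the config.hidden_size attribute and the pooler_output field that the classifier relies on, and it ignores the attention mask.

```python
import torch
import torch.nn as nn
from types import SimpleNamespace

class TinyEncoder(nn.Module):
    """Hypothetical stand-in for BertModel; not part of transformers."""
    def __init__(self, hidden_size=8):
        super().__init__()
        self.config = SimpleNamespace(hidden_size=hidden_size)
        self.embed = nn.Embedding(100, hidden_size)

    def forward(self, input_ids, attention_mask):
        # Crude "pooled" vector: mean of token embeddings (mask ignored here).
        pooled = self.embed(input_ids).mean(dim=1)
        return SimpleNamespace(pooler_output=pooled)

class BertClassifier(nn.Module):
    def __init__(self, bert_model, num_classes=2):
        super().__init__()
        self.bert = bert_model
        # Linear head: hidden_size -> num_classes, as in the exercise.
        self.classifier = nn.Linear(bert_model.config.hidden_size, num_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        return self.classifier(outputs.pooler_output)

model = BertClassifier(TinyEncoder())
logits = model(torch.randint(0, 100, (4, 6)), torch.ones(4, 6))
print(logits.shape)  # one row of 2 logits per sample: torch.Size([4, 2])
```

Swapping TinyEncoder for a real BertModel.from_pretrained('bert-base-uncased') would give the same interface with hidden_size 768.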
Fill all three blanks to compute the accuracy metric after predictions.
import torch

preds = torch.argmax(logits, dim=[1])
correct = (preds == labels).sum().item()
accuracy = correct / [2]
print(f'Accuracy: {accuracy:.2f}')  # Assuming labels is a tensor of size [3]
We take argmax over dimension 1 to get the predicted class for each sample in the batch, then divide correct count by batch size using labels.size(0).
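With the blanks filled in (dim=1 and labels.size(0)), the metric can be checked on a tiny hand-made batch:

```python
import torch

# Toy logits for a batch of 3 samples with 2 classes each.
logits = torch.tensor([[2.0, 0.1],
                       [0.3, 1.5],
                       [1.2, 0.8]])
labels = torch.tensor([0, 1, 1])

preds = torch.argmax(logits, dim=1)       # predicted class per sample: [0, 1, 0]
correct = (preds == labels).sum().item()  # 2 of 3 predictions match
accuracy = correct / labels.size(0)       # 2 / 3
print(f'Accuracy: {accuracy:.2f}')        # Accuracy: 0.67
```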