Why do we fine-tune a pre-trained question answering (QA) model on a custom dataset?
Think about why a model trained on general data might not perform well on your specific questions.
Fine-tuning adjusts the model's knowledge to the specific language, domain, and style of your custom dataset, improving its accuracy on your tasks.
What will be the printed output after running this training loop snippet for 1 epoch?
for epoch in range(1):
    total_loss = 0
    for batch in [[{'input_ids': [1, 2], 'labels': [1, 2]}],
                  [{'input_ids': [3, 4], 'labels': [3, 4]}]]:
        loss = sum(batch[0]['input_ids']) * 0.1
        total_loss += loss
    print(f"Epoch {epoch+1} loss: {total_loss:.2f}")
Calculate the sum of input_ids for each batch and multiply by 0.1, then add them.
First batch: sum([1, 2]) = 3, and 3 * 0.1 = 0.3; second batch: sum([3, 4]) = 7, and 7 * 0.1 = 0.7. So total_loss = 1.0, and the f-string's :.2f format prints "Epoch 1 loss: 1.00".
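The answer can be checked by running the snippet directly. One subtlety: in binary floating point, 3 * 0.1 + 7 * 0.1 is actually 1.0000000000000002, but the :.2f format rounds it to 1.00 either way.

```python
# Minimal reproduction of the training-loop snippet, to verify the printed output.
batches = [
    [{'input_ids': [1, 2], 'labels': [1, 2]}],
    [{'input_ids': [3, 4], 'labels': [3, 4]}],
]
for epoch in range(1):
    total_loss = 0
    for batch in batches:
        loss = sum(batch[0]['input_ids']) * 0.1  # 0.3 on batch 1, 0.7 on batch 2
        total_loss += loss
    message = f"Epoch {epoch+1} loss: {total_loss:.2f}"
    print(message)  # prints "Epoch 1 loss: 1.00"
```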
You are fine-tuning a QA model on a small dataset. Which learning rate is most appropriate to avoid overfitting and unstable training?
Think about how large learning rates affect small datasets and model stability.
Very small learning rates like 1e-5 help fine-tune pre-trained models gently, preventing overfitting and unstable updates on small datasets.
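As a sketch, such a learning rate would typically be set through the training configuration. The snippet below uses Hugging Face TrainingArguments; the output directory and the other hyperparameter values are illustrative assumptions, not values from this document.

```python
# Hypothetical fine-tuning configuration; only learning_rate=1e-5 is the point here.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="qa-finetune",            # assumed name for checkpoints
    learning_rate=1e-5,                  # small LR: gentle updates on a small dataset
    num_train_epochs=3,                  # illustrative value
    per_device_train_batch_size=8,       # illustrative value
    weight_decay=0.01,                   # mild regularization against overfitting
)
```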
Which metric best measures the quality of answers generated by a fine-tuned extractive QA model?
Consider metrics that compare predicted answer spans to ground truth answers exactly.
Exact Match measures the percentage of predictions that exactly match the correct answer span, making it ideal for extractive QA evaluation.
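A minimal Exact Match implementation looks like the sketch below. The normalization steps (lowercasing, stripping punctuation and English articles, collapsing whitespace) follow the common SQuAD-style convention, but this is an illustration rather than the official evaluation script.

```python
import re
import string

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(predictions, references):
    """Percentage of predictions that match the reference after normalization."""
    matches = sum(normalize(p) == normalize(r)
                  for p, r in zip(predictions, references))
    return 100.0 * matches / len(references)

score = exact_match(["The Eiffel Tower", "1969"], ["eiffel tower", "1968"])
print(score)  # 50.0 -- first prediction matches, second does not
```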
Given this snippet from a fine-tuning script, what error will it raise?
from transformers import AutoModelForQuestionAnswering
import torch

model = AutoModelForQuestionAnswering.from_pretrained('bert-base-uncased')
inputs = {
    'input_ids': torch.tensor([[101, 2009, 2003, 1037, 2204, 2154, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1, 1, 1, 1, 1]]),
}
outputs = model(**inputs)
print(outputs.loss)
Check what inputs the model expects during training vs inference.
Without start_positions and end_positions, the model runs in inference mode: it returns start/end logits and sets loss to None. No error is raised (provided the inputs are torch tensors rather than plain lists); the script prints None.
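The behavior can be demonstrated directly. To avoid downloading pretrained weights, the sketch below uses a tiny randomly initialized BERT; the loss-is-None-without-labels behavior is the same for 'bert-base-uncased'.

```python
# Tiny randomly initialized BERT for QA (dimensions are arbitrary illustrative values).
import torch
from transformers import BertConfig, BertForQuestionAnswering

config = BertConfig(hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = BertForQuestionAnswering(config)

inputs = {
    'input_ids': torch.tensor([[101, 2009, 2003, 1037, 2204, 2154, 102]]),
    'attention_mask': torch.tensor([[1, 1, 1, 1, 1, 1, 1]]),
}

# No label positions supplied: inference mode, loss is None.
out_infer = model(**inputs)
print(out_infer.loss)  # None

# With start/end positions the model computes a cross-entropy loss.
out_train = model(**inputs,
                  start_positions=torch.tensor([4]),
                  end_positions=torch.tensor([4]))
print(out_train.loss)  # a scalar loss tensor
```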