Which statement best describes the purpose of pre-training in machine learning models?
Think about how a model learns general knowledge before focusing on a specific problem.
Pre-training helps the model learn broad features from a large dataset, which can then be fine-tuned for a specific task.
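The pre-train-then-fine-tune pattern can be sketched in plain PyTorch. Everything below is synthetic and illustrative (the seed, layer sizes, epoch counts, and random data are assumptions, not a real recipe): a backbone first learns on a large generic task, then only a new task-specific head is trained on a small downstream dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # illustrative seed; all data below is synthetic

# Pre-training: a backbone learns broad features on a large, generic task.
backbone = nn.Sequential(nn.Linear(4, 8), nn.ReLU())
pretrain_head = nn.Linear(8, 4)  # 4-way generic pre-training objective

x_big = torch.randn(256, 4)
y_big = torch.randint(0, 4, (256,))
params = list(backbone.parameters()) + list(pretrain_head.parameters())
opt = torch.optim.SGD(params, lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(20):
    opt.zero_grad()
    loss_fn(pretrain_head(backbone(x_big)), y_big).backward()
    opt.step()

# Fine-tuning: reuse the backbone, train only a new task-specific head.
frozen_weights = backbone[0].weight.detach().clone()
task_head = nn.Linear(8, 2)  # small 2-class downstream task
x_small = torch.randn(32, 4)
y_small = torch.randint(0, 2, (32,))
ft_opt = torch.optim.SGD(task_head.parameters(), lr=0.05)
for _ in range(20):
    ft_opt.zero_grad()
    # detach() keeps the pre-trained backbone frozen during fine-tuning
    loss_fn(task_head(backbone(x_small).detach()), y_small).backward()
    ft_opt.step()
```

The backbone's weights are untouched in the second phase; only the small head adapts to the new task, which is why fine-tuning needs far less data and compute than training from scratch.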
You want to build a sentiment analysis tool using a large language model. Which model type is best suited for fine-tuning on your specific dataset?
Consider models that have already learned language patterns and can be adapted.
Pre-trained transformer models have already learned general language patterns, so they can be fine-tuned efficiently for tasks like sentiment analysis.
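The same head-on-top-of-frozen-features pattern applies to sentiment analysis. The sketch below uses a frozen embedding layer as a stand-in for a real pre-trained encoder (the vocabulary, sentences, seed, and sizes are all synthetic assumptions; a real tool would load an actual pre-trained transformer):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

VOCAB = {"good": 0, "great": 1, "bad": 2, "awful": 3, "movie": 4}

# Stand-in for a pre-trained encoder: frozen embeddings averaged per sentence.
embedding = nn.EmbeddingBag(len(VOCAB), 8, mode="mean")
for p in embedding.parameters():
    p.requires_grad = False  # keep the "pre-trained" weights fixed
pretrained_snapshot = embedding.weight.detach().clone()

head = nn.Linear(8, 2)  # task-specific sentiment head: 0=negative, 1=positive

def encode(sentence):
    # One row of token ids = one "bag" for EmbeddingBag
    return torch.tensor([[VOCAB[w] for w in sentence.split()]])

data = [("good movie", 1), ("great movie", 1), ("bad movie", 0), ("awful movie", 0)]
opt = torch.optim.SGD(head.parameters(), lr=0.5)  # only the head is trained
loss_fn = nn.CrossEntropyLoss()

for _ in range(50):  # fine-tune only the sentiment head
    for text, label in data:
        opt.zero_grad()
        logits = head(embedding(encode(text)))
        loss = loss_fn(logits, torch.tensor([label]))
        loss.backward()
        opt.step()
```

Because the "pre-trained" encoder stays frozen, only the tiny head's weights change, mirroring how a pre-trained transformer can be adapted to sentiment analysis with a small labeled dataset.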
What training accuracy will this simple model report after training for 3 epochs?
import torch
import torch.nn as nn
import torch.optim as optim

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 2)

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Dummy data: inputs and labels
inputs = torch.tensor([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [3.0, 3.0]])
labels = torch.tensor([0, 1, 0, 1])

for epoch in range(3):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()

_, predicted = torch.max(outputs, 1)
correct = (predicted == labels).sum().item()
accuracy = correct / labels.size(0)
print(f"Accuracy after 3 epochs: {accuracy:.2f}")
Check the model output and compare predicted labels to true labels after training.
Because the linear layer's weights are randomly initialized and no seed is set, the printed accuracy varies from run to run. After only 3 epochs the under-trained model typically lands near chance level, about 50% on this small dataset, due to limited training and model capacity.
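Seeding the random number generator makes the result reproducible, and training for longer shows this particular dataset is actually learnable by a linear model (the seed and the 500-epoch count below are illustrative choices, and the exact trajectory depends on the seed):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)  # fix the random initial weights so runs are repeatable

model = nn.Linear(2, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

inputs = torch.tensor([[1.0, 2.0], [2.0, 1.0], [1.5, 1.5], [3.0, 3.0]])
labels = torch.tensor([0, 1, 0, 1])

def accuracy():
    preds = model(inputs).argmax(dim=1)
    return (preds == labels).float().mean().item()

for epoch in range(500):  # far more than the 3 epochs in the question
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    optimizer.step()

print(f"Accuracy after 500 epochs: {accuracy():.2f}")
```

With enough epochs the loss falls well below the chance-level cross-entropy, which is why the 50% figure in the answer reflects insufficient training rather than an inherently unlearnable dataset.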
During fine-tuning a pre-trained model, what is the typical effect of using a learning rate that is too high?
Think about how big steps in learning affect the model's ability to settle on good solutions.
A learning rate that is too high causes the optimizer to take overly large steps, so the model jumps around the solution space and fails to converge stably.
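The effect is visible even on a one-dimensional toy objective f(x) = x**2: gradient descent contracts toward the minimum when the step size is small, but overshoots and diverges once it is too large (the specific thresholds below are particular to this toy function, not general rules):

```python
def gradient_descent(lr, steps=20, x=1.0):
    """Minimize f(x) = x**2 (gradient 2*x) starting from x = 1."""
    for _ in range(steps):
        x = x - lr * 2 * x  # each update scales x by (1 - 2*lr)
    return x

stable = gradient_descent(lr=0.1)   # |1 - 2*0.1| = 0.8 < 1: converges
unstable = gradient_descent(lr=1.1) # |1 - 2*1.1| = 1.2 > 1: diverges

print(f"lr=0.1 -> x = {stable:.4f}")  # shrinks toward the minimum at 0
print(f"lr=1.1 -> x = {unstable:.1f}")  # grows with every step
```

Each update multiplies x by (1 - 2*lr), so the iterate shrinks when that factor has magnitude below 1 and blows up when it exceeds 1, which is exactly the "jumping around without settling" behavior described above.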
You fine-tuned a large pre-trained model on a small dataset but the validation accuracy is very low and training loss does not improve. Which issue is most likely causing this?
Consider what happens when training a complex model on limited data with aggressive settings.
An overly high learning rate can prevent the model from properly adjusting its weights during fine-tuning: the training loss fails to decrease and validation accuracy stays low.