In a fine-tuning approach for a convolutional neural network, which layers are typically retrained to adapt the model to a new task?
Think about which parts of the model capture general features versus task-specific features.
Fine-tuning usually involves freezing early layers that capture general features and retraining the final layers to adapt to the new task.
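As a minimal sketch of this freeze-then-replace recipe (the model here is a hypothetical stand-in for a pretrained CNN, not a real checkpoint):

```python
import torch
import torch.nn as nn

# Minimal stand-in for a pretrained CNN (hypothetical, for illustration)
class TinyCNN(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())
        self.classifier = nn.Linear(16 * 30 * 30, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.view(x.size(0), -1))

model = TinyCNN(num_classes=1000)  # pretend these weights are pretrained

# Freeze the early feature-extraction layers (general features)
for param in model.features.parameters():
    param.requires_grad = False

# Replace the task-specific head; a freshly created layer
# defaults to requires_grad=True, so only it will be trained
model.classifier = nn.Linear(16 * 30 * 30, 10)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
# trainable now lists only the new classifier's weight and bias
```

After this, an optimizer built over the model will only receive gradients for the new head, while the frozen feature extractor keeps its pretrained weights.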
Given a pretrained CNN with a 1000-class output layer, you replace that layer with a new 10-class layer for fine-tuning. What will be the output shape of the model for a batch of 32 images?
import torch
import torch.nn as nn

class PretrainedModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU())
        self.classifier = nn.Linear(16 * 30 * 30, 1000)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # Flatten for the linear layer
        x = self.classifier(x)
        return x

model = PretrainedModel()
model.classifier = nn.Linear(16 * 30 * 30, 10)  # Replace output layer

input_tensor = torch.randn(32, 3, 32, 32)
output = model(input_tensor)
output_shape = output.shape
Remember the batch size is the first dimension in PyTorch tensors.
The output shape is (batch_size, number_of_classes). After replacing the output layer with a 10-class layer, a batch of 32 images produces an output of shape (32, 10).
When fine-tuning a pretrained model, which learning rate strategy is generally recommended?
Think about how pretrained weights should be adjusted carefully.
A smaller learning rate helps preserve useful pretrained features while allowing gradual adaptation to new data.
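One way to apply this in practice is with per-parameter-group learning rates: a small rate for the pretrained layers and a larger one for the new head. A minimal sketch (the two-part model and the specific rates are illustrative assumptions, not a prescribed recipe):

```python
import torch
import torch.nn as nn

# Hypothetical two-part model: a "backbone" standing in for pretrained
# layers and a freshly initialized "head" for the new task
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
head = nn.Linear(64, 10)
model = nn.Sequential(backbone, head)

# Small lr for pretrained weights (gentle adjustment), larger lr for
# the new head (it starts from random weights and must adapt quickly)
optimizer = torch.optim.SGD([
    {"params": backbone.parameters(), "lr": 1e-4},
    {"params": head.parameters(), "lr": 1e-2},
])
```

Because every parameter group specifies its own lr, no global learning rate is needed; the optimizer applies each group's rate during step().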
After fine-tuning a model on a new dataset, which metric would best indicate if the model has successfully adapted without overfitting?
Consider what it means when training and validation metrics are similar.
High and similar training and validation accuracy indicate good generalization and successful fine-tuning.
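The check itself is simple to express in code. The accuracy values and the 5-point gap threshold below are made-up illustrations, not standard cutoffs:

```python
# Toy illustration of the overfitting check after fine-tuning.
# The numbers and thresholds here are assumptions for demonstration.
train_acc, val_acc = 0.94, 0.92

gap = train_acc - val_acc
if val_acc > 0.9 and gap < 0.05:
    verdict = "good generalization: both accuracies high and close together"
elif gap >= 0.05:
    verdict = "possible overfitting: training accuracy far above validation"
else:
    verdict = "possible underfitting: both accuracies are low"
```

A large gap (high training accuracy, much lower validation accuracy) is the classic overfitting signature; two similar but low accuracies suggest the model has not adapted enough.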
Consider this PyTorch code snippet for fine-tuning a pretrained model. What error, if any, will it raise?
import torch
import torch.nn as nn
from torchvision import models

# Note: pretrained=True is deprecated in newer torchvision releases,
# which prefer weights=models.ResNet18_Weights.DEFAULT
model = models.resnet18(pretrained=True)

# Freeze all pretrained parameters
for param in model.parameters():
    param.requires_grad = False

# Replace the head; the new layer is trainable by default
model.fc = nn.Linear(model.fc.in_features, 5)

# The optimizer receives all parameters, frozen ones included
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop omitted
# Attempt to update only trainable parameters
optimizer.zero_grad()
# This loss is a standalone tensor, disconnected from the model,
# so backward() produces no gradients for any model parameter
loss = torch.tensor(1.0, requires_grad=True)
loss.backward()
optimizer.step()
Think about what happens to .grad for parameters with requires_grad=False, and how optimizer.step() treats parameters whose gradient is None.
No error is raised. Frozen parameters never accumulate gradients, so their .grad stays None, and optimizer.step() simply skips any parameter whose gradient is None. (Here the loss is also disconnected from the model, so even the trainable head receives no gradient.)
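Even though passing frozen parameters to the optimizer is harmless, the common idiom is to hand it only the trainable ones, which makes the intent explicit and keeps frozen tensors out of the optimizer state. A minimal sketch with a toy model (names and sizes hypothetical):

```python
import torch
import torch.nn as nn

# Toy model: the first layer stands in for frozen "pretrained" weights
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 5))
for p in model[0].parameters():
    p.requires_grad = False  # freeze the "pretrained" layer

# Pass only parameters that require gradients to the optimizer
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01
)

# One update step on a loss that actually depends on the model
x = torch.randn(4, 8)
loss = model(x).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()  # only the unfrozen layer's weights move
```

After backward(), the frozen layer's .grad is still None while the trainable head has gradients, which is exactly why step() leaves the frozen weights untouched.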