Consider the following PyTorch code that freezes some layers of a model. What will be the value of requires_grad for each parameter after running this code?
```python
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 5)
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        return x

model = SimpleModel()

# Freeze fc1 layer
for param in model.fc1.parameters():
    param.requires_grad = False

requires_grad_list = [param.requires_grad for param in model.parameters()]
print(requires_grad_list)
```
Remember that each Linear layer has two parameters: weights and biases.
The fc1 layer has two parameters (weight and bias), both set to requires_grad=False; the fc2 layer's weight and bias remain trainable (requires_grad=True). Because model.parameters() yields parameters in registration order (fc1.weight, fc1.bias, fc2.weight, fc2.bias), the printed list is [False, False, True, True].
You want to freeze all convolutional layers in a PyTorch model but keep other layers trainable. Which code snippet correctly achieves this?
Parameters themselves are tensors, not layers. You need to access modules to check their type.
The correct option iterates over model.modules(), checks each module's type with isinstance (e.g., isinstance(module, nn.Conv2d)), and then freezes that module's parameters. The other options fail for different reasons: matching on parameter names is unreliable because naming conventions vary between models; parameters are tensors, so checking them for a layer type is invalid; and requires_grad is a tensor attribute, so setting it on a layer object does not freeze the layer's parameters.
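The correct approach can be sketched as follows. The model here is a hypothetical mix of convolutional and linear layers chosen for illustration; the key pattern is iterating over modules and testing their type:

```python
import torch.nn as nn

# Hypothetical model mixing conv and linear layers, for illustration only.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
    nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3),
    nn.Flatten(),
    nn.Linear(128, 10),
)

# Iterate over modules (not parameters) so we can check the layer type,
# then freeze only the parameters belonging to convolutional layers.
for module in model.modules():
    if isinstance(module, nn.Conv2d):
        for param in module.parameters():
            param.requires_grad = False
```

After this loop, the four conv parameters (two weights, two biases) are frozen while the linear layer's weight and bias stay trainable.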
You freeze some layers in your PyTorch model by setting requires_grad=False for their parameters. What should you do with the optimizer to avoid updating frozen parameters?
The optimizer updates every parameter it receives; frozen parameters should not be included.
Only parameters with requires_grad=True should be passed to the optimizer. Passing frozen parameters wastes memory on unnecessary optimizer state and can let stale gradients or momentum buffers update weights you intended to freeze.
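A minimal sketch of this pattern, using a small two-layer model as a stand-in for a real network: filter the parameter list on requires_grad before constructing the optimizer.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 5), nn.Linear(5, 2))

# Freeze the first layer.
for param in model[0].parameters():
    param.requires_grad = False

# Pass only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.01)
```

Here `trainable` contains only the second layer's weight and bias, so the optimizer never touches the frozen layer.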
You froze some layers by setting requires_grad=False, but after training, those layers' weights changed. What is the most likely cause?
Freezing layers disables gradient computation, but the optimizer still updates any parameters it was given.
If frozen parameters were passed to the optimizer, they can still be updated by optimizer steps: stale .grad values left over from before freezing, momentum buffers, and weight decay can all move a parameter even though no new gradients are computed for it.
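A minimal sketch of one such failure mode: a backward pass populates .grad, the layer is frozen afterwards, and the next optimizer step still applies the stale gradient. The single-layer setup is illustrative only.

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 4)
optimizer = torch.optim.SGD(layer.parameters(), lr=0.1)

# A backward pass populates .grad, then we freeze the layer afterwards.
layer(torch.randn(2, 4)).sum().backward()
for param in layer.parameters():
    param.requires_grad = False

before = layer.weight.clone()
optimizer.step()  # the stale .grad is still applied to the "frozen" weights
changed = not torch.equal(before, layer.weight)
```

The fix is to zero the gradients (e.g., optimizer.zero_grad(set_to_none=True)) when freezing, or better, to rebuild the optimizer with only the trainable parameters.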
When using transfer learning, freezing early layers of a pretrained model is common. What is the main reason for freezing these layers?
Think about what early layers learn in deep networks and why retraining them might be unnecessary.
Early layers learn general features like edges and textures useful across tasks. Freezing them reduces training time and risk of overfitting on small datasets.
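This idea can be sketched with a hypothetical "pretrained" backbone plus a new task head; a plain MLP stands in here for a real convolutional backbone such as a torchvision model.

```python
import torch.nn as nn

# Hypothetical pretrained feature extractor (stand-in for a real backbone).
backbone = nn.Sequential(
    nn.Linear(100, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
)
head = nn.Linear(32, 5)  # new task-specific head, trained from scratch
model = nn.Sequential(backbone, head)

# Freeze the early, general-feature layers; only the head will be trained.
for param in backbone.parameters():
    param.requires_grad = False
```

With the backbone frozen, training updates only the head's weight and bias, which is both faster and less prone to overfitting on a small target dataset.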