Which of the following statements best describes the main effect of applying post-training quantization to a PyTorch model?
Think about how quantization changes the data type of weights and its effect on storage.
Quantization reduces model size by changing weights from 32-bit floats to smaller integer types like 8-bit integers, which saves memory and can speed up inference with little accuracy loss.
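The effect is easy to demonstrate. Below is a minimal sketch using dynamic quantization on a small made-up model (the layer sizes are arbitrary, chosen only for illustration); the quantized model's serialized state dict is noticeably smaller because Linear weights are stored as 8-bit integers instead of 32-bit floats.

```python
import io

import torch
import torch.nn as nn

# A small example model (hypothetical, for illustration only).
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization: Linear weights become int8;
# activations are quantized on the fly at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def model_size_bytes(m):
    # Serialize the state dict to an in-memory buffer and measure it.
    buf = io.BytesIO()
    torch.save(m.state_dict(), buf)
    return buf.getbuffer().nbytes

print("fp32 size:", model_size_bytes(model))
print("int8 size:", model_size_bytes(quantized))
```

The quantized model should come out roughly 4x smaller for the weight tensors, since int8 uses a quarter of the storage of float32 (plus a small overhead for scales and zero points).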
Consider the following PyTorch code that applies pruning to a linear layer. What will be the number of zero weights in model.fc.weight after pruning?
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 5)

model = SimpleModel()
prune.l1_unstructured(model.fc, name='weight', amount=0.4)
zero_weights = torch.sum(model.fc.weight == 0).item()
print(zero_weights)
Calculate total weights and 40% of them.
The linear layer has 10 inputs and 5 outputs, so its weight matrix has 10 × 5 = 50 entries. L1 unstructured pruning with amount=0.4 zeroes the 40% of weights with the smallest absolute value: 0.4 × 50 = 20 weights.
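The arithmetic can be checked directly; this is a minimal verification sketch of the same setup, using a bare nn.Linear rather than the wrapper class:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

fc = nn.Linear(10, 5)

# Total weight count: 10 inputs x 5 outputs = 50.
total = fc.weight.numel()

# Zero out the 40% of weights with smallest L1 magnitude.
prune.l1_unstructured(fc, name='weight', amount=0.4)

# fc.weight is now weight_orig * weight_mask; count the zeroed entries.
zeros = torch.sum(fc.weight == 0).item()
print(total, zeros)  # 50 20
```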
You want to prune a PyTorch model to reduce size but keep accuracy loss under 2%. Which pruning amount is most likely to meet this goal?
Smaller pruning amounts usually preserve accuracy better.
Pruning only 5% of weights with L1 unstructured pruning removes just the smallest-magnitude (least important) weights, so the accuracy drop is very likely to stay under 2%. Larger pruning amounts, or random (rather than magnitude-based) pruning, risk a bigger drop.
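A conservative pruning pass might look like the sketch below (the model architecture here is made up for illustration; in practice you would measure accuracy on your validation set before and after, and lower the amount if the drop exceeds 2%):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical model standing in for a trained network.
model = nn.Sequential(nn.Linear(100, 50), nn.ReLU(), nn.Linear(50, 10))

# Prune only 5% of each Linear layer's weights by L1 magnitude.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name='weight', amount=0.05)

# Measure the resulting overall sparsity of the pruned layers.
linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
total = sum(m.weight.numel() for m in linears)
zeros = sum(int(torch.sum(m.weight == 0)) for m in linears)
print(f"sparsity: {zeros / total:.2%}")  # sparsity: 5.00%
```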
After applying dynamic quantization to a PyTorch LSTM model, you observe the following accuracies on the test set:
- Original model accuracy: 92%
- Quantized model accuracy: 89%
What is the best interpretation of this result?
Quantization trades off some accuracy for efficiency.
Dynamic quantization reduces model size and speeds up inference but can cause a small accuracy drop, which is normal and expected.
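For reference, dynamic quantization of an LSTM is a one-line transformation; the sketch below uses arbitrary dimensions for illustration. No retraining or calibration data is needed, which is why some accuracy loss (as in the 92% → 89% example) is the expected trade-off.

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=32, hidden_size=64, num_layers=2)

# Dynamic quantization: LSTM weights stored as int8,
# activations quantized on the fly at inference time.
qlstm = torch.quantization.quantize_dynamic(lstm, {nn.LSTM}, dtype=torch.qint8)

x = torch.randn(5, 3, 32)  # (seq_len, batch, features)
out, _ = qlstm(x)
print(out.shape)  # torch.Size([5, 3, 64])
```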
What error will the following PyTorch code raise when trying to prune a model's convolutional layer?
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

class ConvModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, bias=False)

model = ConvModel()
prune.l1_unstructured(model.conv, name='bias', amount=0.3)
Check whether the layer actually has a bias parameter, given that bias=False is passed to Conv2d.
Conv2d creates a bias parameter by default, but this layer is constructed with bias=False, so model.conv.bias is None: there is no bias tensor to prune. The call therefore fails — in current PyTorch the failure surfaces as a TypeError when the pruning utility tries to build a mask from the missing tensor (torch.ones_like(None)), because pruning requires an existing parameter tensor.
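A defensive version checks for the parameter before pruning; this sketch falls back to pruning the weights (an illustrative choice, not the only option) when the bias is absent:

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(3, 16, 3, bias=False)

# Guard against pruning a parameter that does not exist.
if getattr(conv, 'bias', None) is not None:
    prune.l1_unstructured(conv, name='bias', amount=0.3)
else:
    # bias=False leaves conv.bias as None, so prune the weights instead.
    prune.l1_unstructured(conv, name='weight', amount=0.3)

# 16 * 3 * 3 * 3 = 432 weights; 30% rounds to 130 zeroed entries.
zeros = int(torch.sum(conv.weight == 0))
print(zeros)  # 130
```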