Complete the code to apply dynamic quantization to a PyTorch model.
import torch
import torch.nn as nn

model = nn.Linear(10, 5)
quantized_model = torch.quantization.[1](model, {nn.Linear})
Dynamic quantization is applied using torch.quantization.quantize_dynamic for modules like nn.Linear.
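A minimal runnable sketch of the completed answer. It quantizes the weights of every `nn.Linear` module to int8; activations are quantized dynamically at inference time, and the quantized model still accepts ordinary float tensors.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 5)

# Dynamic quantization: int8 weights, activations quantized on the fly.
quantized_model = torch.quantization.quantize_dynamic(
    model,             # model to quantize
    {nn.Linear},       # set of module types to replace
    dtype=torch.qint8, # quantized weight dtype
)

# The quantized model is used like the original float model.
out = quantized_model(torch.randn(2, 10))
```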
Complete the code to prune 20% of the weights in the first linear layer using L1 unstructured pruning.
import torch.nn.utils.prune as prune

prune.l1_unstructured(model.[1], name='weight', amount=0.2)
The blank is the first linear layer of the model (here model.linear, matching the later exercise); name='weight' then selects that layer's weight parameter, and amount=0.2 prunes the 20% of entries with the smallest L1 magnitude.
Fix the error in the code to remove pruning reparameterization from the linear layer.
prune.[1](model.linear, 'weight')
The correct function to remove pruning is prune.remove.
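A sketch showing why the removal step exists: while a layer is pruned, PyTorch reparameterizes it with `weight_orig` and `weight_mask`, recomputing `weight` on every forward pass. `prune.remove` bakes the mask into `weight` permanently and deletes the extra tensors.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(10, 5)
prune.l1_unstructured(layer, name='weight', amount=0.2)
assert hasattr(layer, 'weight_orig')  # reparameterization is present

# Make the pruning permanent: weight becomes a plain parameter again.
prune.remove(layer, 'weight')
```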
Fill both blanks to apply global unstructured pruning to two layers with 30% sparsity.
parameters_to_prune = [(model.fc1, 'weight'), (model.fc2, 'weight')]
prune.[1](parameters_to_prune, pruning_method=prune.[2], amount=0.3)
Global pruning is done with prune.global_unstructured; the pruning_method argument takes a pruning class, prune.L1Unstructured (not the l1_unstructured function).
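A runnable sketch of the completed answer, using a hypothetical model with `fc1` and `fc2` attributes (the exercise names the attributes but not the model). Global pruning ranks weights across both layers together, so per-layer sparsity may differ while the combined sparsity is 30%.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Hypothetical two-layer model for illustration.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(10, 8)
        self.fc2 = nn.Linear(8, 4)

model = Net()
parameters_to_prune = [(model.fc1, 'weight'), (model.fc2, 'weight')]

prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,  # the class, not the function
    amount=0.3,                           # 30% of all listed weights
)

# Combined sparsity across both layers (~0.3).
total = sum(m.weight.numel() for m, _ in parameters_to_prune)
zeros = sum(int((m.weight == 0).sum()) for m, _ in parameters_to_prune)
global_sparsity = zeros / total
```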
Fill all three blanks to create a quantized model, prepare it for static quantization, and convert it.
model.qconfig = torch.quantization.[1]('fbgemm')
torch.quantization.[2](model, inplace=True)
quantized_model = torch.quantization.[3](model)
For static quantization, assign a qconfig with torch.quantization.get_default_qconfig, call prepare to insert observers, run calibration data through the model, and finally call convert.
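A runnable sketch of the full static-quantization flow under one common assumption the exercise leaves implicit: the model must be wrapped in `QuantStub`/`DeQuantStub` so activation statistics can be observed during calibration (the `Net` class here is illustrative).

```python
import torch
import torch.nn as nn

# Hypothetical model wrapped for static quantization.
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(10, 5)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.fc(self.quant(x)))

model = Net().eval()

# 1) Choose a qconfig for the x86 'fbgemm' backend.
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')

# 2) Insert observers, then calibrate with representative data.
torch.quantization.prepare(model, inplace=True)
model(torch.randn(8, 10))

# 3) Convert observed modules to quantized equivalents.
quantized_model = torch.quantization.convert(model)

out = quantized_model(torch.randn(2, 10))
```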