What is the primary effect of pruning a convolutional neural network model?
Think about how pruning affects model size and speed.
Pruning removes less important weights, reducing model size and speeding up inference without changing the architecture.
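To make this concrete, here is a minimal numpy sketch of magnitude pruning (one common scheme, not the only one): the fraction of weights with the smallest absolute value is zeroed out, shrinking the effective model without touching the layer shapes.

```python
import numpy as np

# Illustrative magnitude pruning: zero out the fraction `prune_frac`
# of weights with the smallest absolute value. The tensor shape (and
# hence the architecture) is unchanged; only values become zero.
def magnitude_prune(weights, prune_frac=0.5):
    threshold = np.quantile(np.abs(weights), prune_frac)
    mask = np.abs(weights) >= threshold
    return weights * mask

w = np.array([0.9, -0.05, 0.3, -1.2, 0.01, 2.7])
pruned = magnitude_prune(w, prune_frac=0.5)
print(pruned)
```

The function name and the 50% fraction are illustrative choices for this example, not part of the question above.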
Given a floating-point weight tensor [0.9, -1.2, 0.3, 2.7], what is the output after 8-bit symmetric quantization with scale 0.1 and zero-point 0?
import numpy as np

# Symmetric quantization: divide by the scale, round to the nearest
# integer, and cast to int8 (zero-point is 0, so no offset is added).
weights = np.array([0.9, -1.2, 0.3, 2.7])
scale = 0.1
zero_point = 0
quantized = np.round(weights / scale) + zero_point
quantized = quantized.astype(np.int8)
print(quantized.tolist())  # [9, -12, 3, 27]
Divide each weight by the scale and round to nearest integer.
Each weight is divided by 0.1 and rounded to the nearest integer: 0.9/0.1 = 9, -1.2/0.1 = -12, 0.3/0.1 = 3, 2.7/0.1 = 27, so the quantized tensor is [9, -12, 3, 27].
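A useful follow-up is the round trip: dequantizing shows that the original weights are recovered only up to the rounding error. This is a hedged sketch using the same tensor and scale as above.

```python
import numpy as np

# Round-trip sketch for symmetric quantization (zero-point 0):
# dequantization multiplies back by the scale, recovering the weights
# only up to the rounding error introduced by np.round.
weights = np.array([0.9, -1.2, 0.3, 2.7])
scale = 0.1

quantized = np.round(weights / scale).astype(np.int8)
dequantized = quantized.astype(np.float32) * scale

print(quantized.tolist())  # [9, -12, 3, 27]
print(np.max(np.abs(weights - dequantized)))  # bounded by scale / 2
```

Here every weight happens to be an exact multiple of the scale, so the round-trip error is near zero; in general it can be as large as half the scale.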
You want to prune a deep CNN model to reduce size but keep accuracy loss under 2%. Which pruning percentage is most reasonable to start with?
Start with a small pruning amount to avoid big accuracy drops.
Pruning 10% is a safe start to reduce size with minimal accuracy loss; pruning 90% is too aggressive and likely harms accuracy.
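The intuition behind "10% is safe, 90% is aggressive" can be sketched numerically. Using synthetic normally distributed weights (an assumption for illustration, not from the question), we can compare how much of the total weight magnitude each pruning level removes.

```python
import numpy as np

# Synthetic illustration: fraction of total weight magnitude removed
# when pruning the smallest 10% vs the smallest 90% of weights.
# The weights are random normal draws, chosen only for demonstration.
rng = np.random.default_rng(0)
w = rng.normal(size=10_000)

def removed_magnitude_fraction(weights, prune_frac):
    threshold = np.quantile(np.abs(weights), prune_frac)
    removed = np.abs(weights)[np.abs(weights) < threshold]
    return removed.sum() / np.abs(weights).sum()

print(removed_magnitude_fraction(w, 0.10))  # small fraction of the signal
print(removed_magnitude_fraction(w, 0.90))  # most of the signal
```

Pruning 10% removes only a few percent of the total magnitude, while pruning 90% removes most of it, which is why the aggressive setting typically costs far more accuracy.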
After quantizing a model, you observe the accuracy dropped from 92% to 89%. What metric best describes this change?
Calculate the difference between original and quantized accuracy.
The accuracy dropped by 3 percentage points (92% - 89% = 3 points); this accuracy degradation is the standard metric for quantization-induced quality loss.
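The arithmetic is a simple subtraction:

```python
# Accuracy degradation in percentage points (values from the question).
baseline_acc = 92.0   # accuracy before quantization, in percent
quantized_acc = 89.0  # accuracy after quantization, in percent

accuracy_drop = baseline_acc - quantized_acc
print(accuracy_drop)  # 3.0 percentage points
```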
Consider this pruning code snippet for a PyTorch model:

for name, param in model.named_parameters():
    if 'weight' in name:
        mask = (param.abs() > threshold)
        param.data = param.data * mask

What is the cause of the runtime error?
Look for undefined variables in the code.
The threshold variable is never defined before it is used in the comparison param.abs() > threshold, so Python raises a NameError on the first iteration.
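A corrected sketch of the loop follows. The fix is to define threshold before the comparison; deriving it per-tensor as the median absolute weight is an illustrative choice (the original snippet does not specify how threshold should be set), and the small nn.Linear model is a stand-in for demonstration.

```python
import torch
import torch.nn as nn

# Stand-in model so the snippet is self-contained; any model with
# 'weight' parameters would work the same way.
model = nn.Linear(4, 2)

with torch.no_grad():
    for name, param in model.named_parameters():
        if 'weight' in name:
            # Define the threshold BEFORE using it (this was the bug):
            # here, the median absolute weight of this tensor.
            threshold = param.abs().median()
            mask = (param.abs() > threshold).float()
            param.mul_(mask)  # zero out the small-magnitude weights in place
```

Wrapping the update in torch.no_grad() and using param.mul_(mask) avoids mutating param.data directly, which is the more idiomatic in-place update in current PyTorch.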