Recall & Review
beginner
What is model pruning in machine learning?
Model pruning is a technique that removes less important parts of a neural network, like some connections or neurons, to make the model smaller and faster without losing much accuracy.
beginner
Explain quantization in the context of neural networks.
Quantization reduces the precision of the numbers used in a model, for example changing 32-bit floats to 8-bit integers, which makes the model smaller and faster to run, especially on devices with limited resources.
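The float-to-integer mapping described above can be sketched in plain Python. This is a minimal illustration of one common scheme (affine quantization with a scale and a zero point), not the implementation used by any particular framework; the function names and the sample weights are made up for the example.

```python
def quantize_int8(values):
    """Map floats to int8 codes using a scale and a zero point (illustrative)."""
    lo, hi = min(values), max(values)
    # Stretch the range to include 0.0 so that zero is represented exactly.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.82, -0.11, 0.0, 0.37, 1.25]  # hypothetical float32 weights
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize(codes, scale, zero_point)

# The round trip is lossy: each value is off by at most about one step (scale).
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each code fits in one byte instead of four, and the small round-trip error is the precision that quantization gives up.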
intermediate
How does pruning help improve model performance?
Pruning removes unnecessary parts of the model, which reduces its size and speeds up predictions, making it easier to run on devices with less memory or slower processors.
intermediate
What is a common trade-off when applying quantization to a model?
The trade-off is between model size and speed versus accuracy. Quantization makes the model smaller and faster but can slightly reduce its accuracy due to lower number precision.
advanced
Name two common types of pruning used in model optimization.
Two common types are: 1) Weight pruning, which removes individual connections with small weights, and 2) Structured pruning, which removes entire neurons or filters to simplify the model structure.
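The difference between the two types can be seen on a tiny weight matrix. The sketch below is a plain-Python illustration, assuming each row holds the incoming weights of one neuron; the helper names and the numbers are hypothetical and do not come from any library.

```python
def prune_unstructured(matrix, amount):
    """Zero the `amount` fraction of individual weights with the smallest magnitude."""
    cells = [(abs(w), r, c) for r, row in enumerate(matrix) for c, w in enumerate(row)]
    k = round(len(cells) * amount)
    pruned = [row[:] for row in matrix]
    for _, r, c in sorted(cells)[:k]:
        pruned[r][c] = 0.0  # the matrix keeps its shape; only values change
    return pruned


def prune_structured(matrix, n_rows):
    """Drop the `n_rows` neurons (rows) whose weights have the smallest L1 norm."""
    ranked = sorted(range(len(matrix)), key=lambda r: sum(abs(w) for w in matrix[r]))
    drop = set(ranked[:n_rows])
    return [row[:] for r, row in enumerate(matrix) if r not in drop]


weights = [
    [0.90, -0.05, 0.40],
    [0.02, 0.10, -0.12],
    [-0.70, 0.60, 0.01],
    [0.30, -0.20, 0.50],
]

sparse = prune_unstructured(weights, amount=0.25)  # still 4 x 3, with 3 zeros
smaller = prune_structured(weights, n_rows=1)      # 3 x 3, weakest neuron removed
print(sparse)
print(smaller)
```

Weight pruning leaves a same-sized matrix with scattered zeros, while structured pruning produces a genuinely smaller matrix that needs no special sparse support to run faster.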
What does pruning mainly remove from a neural network?
A. Training data samples
B. Output layers
C. Input features
D. Less important connections or neurons
Answer: D
Pruning removes less important connections or neurons to reduce model size and speed up inference.
Quantization typically changes model numbers from:
A. Integers to floats
B. 32-bit floats to 8-bit integers
C. 8-bit integers to 32-bit floats
D. Strings to numbers
Answer: B
Quantization reduces precision, often converting 32-bit floating-point numbers to 8-bit integers.
Which is a benefit of model pruning?
A. Speeds up model inference
B. Requires more memory
C. Adds more layers
D. Increases model size
Answer: A
Pruning reduces model size and speeds up inference by removing unnecessary parts.
What is a possible downside of quantization?
A. Model becomes slower
B. Model uses more memory
C. Model accuracy may slightly decrease
D. Model requires more training data
Answer: C
Quantization can slightly reduce accuracy due to lower number precision.
Structured pruning removes:
A. Entire neurons or filters
B. Individual weights only
C. Training samples
D. Input features
Answer: A
Structured pruning removes whole neurons or filters to simplify the model.
Describe how pruning and quantization help optimize a computer vision model for deployment on mobile devices.
Think about how smaller and faster models help on phones.
Explain the trade-offs involved when applying pruning and quantization to a neural network.
Consider what you lose and gain with these techniques.
Practice
1. What is the main goal of model pruning in computer vision?
easy
A. To remove less important parts of the model to reduce size
B. To increase the number of layers in the model
C. To add more training data for better accuracy
D. To convert the model to a different programming language
Solution
Step 1: Understand pruning concept
Pruning means removing parts of the model that contribute less to its output.
Step 2: Identify pruning goal
The goal is to reduce model size and speed up inference by cutting unnecessary parts.
Final Answer:
To remove less important parts of the model to reduce size -> Option A
Quick Check:
Pruning = Remove less important parts [OK]
Hint: Pruning cuts unneeded parts to shrink model size [OK]
Common Mistakes:
Thinking pruning adds layers instead of removing parts of the model
Confusing pruning with data augmentation
Believing pruning changes the programming language
2. Which of the following is the correct way to apply quantization in TensorFlow Lite?
easy
A. model = tf.lite.TFLiteConverter.from_keras_model(model).convert()
B. converter.optimizations = [tf.lite.Optimize.DEFAULT]
C. model.compile(optimizer='adam', loss='mse')
D. model.fit(x_train, y_train, epochs=10)
Solution
Step 1: Identify quantization syntax
In TensorFlow Lite, post-training quantization is enabled by setting converter.optimizations = [tf.lite.Optimize.DEFAULT] on the converter before calling convert().
Step 2: Check other options
Option A converts the model but does not enable quantization. Options C and D are training commands, not quantization settings.
Final Answer:
converter.optimizations = [tf.lite.Optimize.DEFAULT] -> Option B
Quick Check:
Quantization flag = converter.optimizations [OK]
Hint: Quantization needs converter.optimizations set to Optimize.DEFAULT [OK]
Common Mistakes:
Confusing model conversion with quantization
Using training commands instead of conversion flags
Missing the optimization setting for quantization
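Putting options A and B together, a full conversion might look like the sketch below. It assumes TensorFlow 2.x is installed and that a trained Keras model is passed in; the function name is hypothetical, and TensorFlow is imported inside the function only so the definition can be loaded without it.

```python
def convert_with_default_quantization(keras_model):
    """Convert a trained Keras model to TFLite bytes with default optimizations."""
    import tensorflow as tf  # assumption: TensorFlow 2.x is available

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    # The line from option B. Without it, convert() keeps float32 weights.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    return converter.convert()  # serialized TFLite model as bytes
```

The returned bytes can then be written to a file (for example a hypothetical `model.tflite`) and loaded on the device. The setting has to be applied before `convert()` is called, which is why option A on its own produces an unquantized model.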
3. Given this PyTorch pruning code snippet, what will be the output size of the model's first linear layer weights after pruning 20% of connections?
Solution
Step 1: Count the total weights
The first linear layer has 100 inputs and 50 outputs, so total weights = 100 * 50 = 5000.
Step 2: Calculate remaining weights after pruning
Pruning 20% removes 20% of weights, so remaining weights = 80% of 5000 = 4000.
Step 3: Understand pruning method
PyTorch's l1_unstructured pruning masks weights instead of deleting them, so the weight tensor still holds 5000 elements, of which 4000 are non-zero.
Step 4: Check print output
The print statement counts non-zero weights, so output is 4000.
Common Mistakes:
Using the pruning amount as the fraction remaining instead of the fraction removed
Confusing layer input/output dimensions
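The snippet the question refers to is not reproduced here, so the following is only a guess at what such code could look like, assuming PyTorch is installed and the layer is a freshly initialized `torch.nn.Linear(100, 50)`. The function name and its defaults are hypothetical.

```python
def count_after_pruning(in_features=100, out_features=50, amount=0.2):
    """Prune a linear layer by L1 magnitude; return (total, non-zero) weight counts."""
    import torch  # assumption: PyTorch is available
    import torch.nn.utils.prune as prune

    layer = torch.nn.Linear(in_features, out_features)
    # Adds `weight_orig` and `weight_mask`; `layer.weight` becomes their product.
    prune.l1_unstructured(layer, name="weight", amount=amount)

    total = layer.weight.numel()                       # tensor size is unchanged
    nonzero = int(torch.count_nonzero(layer.weight))   # masked entries are zero
    return total, nonzero
```

With the defaults, and assuming no weight happens to be initialized to exactly zero, this should match the reasoning in the solution: the tensor keeps all 5000 entries while 4000 of them stay non-zero.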
4. You tried to quantize a model but got an error: AttributeError: 'TFLiteConverter' object has no attribute 'optimizations'. What is the likely cause?
medium
A. Quantization requires training the model again
B. Model is too large to quantize
C. Using an outdated TensorFlow version without quantization support
D. The model has no weights to quantize
Solution
Step 1: Understand the error
The error says the converter object has no 'optimizations' attribute, which points to an installed TensorFlow version that predates it.
Step 2: Identify cause
Older TensorFlow versions do not support the 'optimizations' attribute needed for quantization.
Final Answer:
Using an outdated TensorFlow version without quantization support -> Option C
Quick Check:
Missing attribute = outdated TensorFlow [OK]
Hint: Check TensorFlow version supports quantization features [OK]
Common Mistakes:
Assuming model size causes attribute error
Thinking quantization always needs retraining
Believing model without weights causes this error
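One way to surface this problem with a clearer message is to check for the attribute before using it. The sketch below uses two stand-in classes instead of real TensorFlow converters, so every name in it is hypothetical; it only illustrates the guard.

```python
class OldStyleConverter:
    """Stand-in for a converter from a release without `optimizations`."""

    def __init__(self, model):
        self.model = model


class NewStyleConverter:
    """Stand-in for a converter that supports `optimizations`."""

    def __init__(self, model):
        self.model = model
        self.optimizations = []


def enable_default_optimizations(converter, flag="DEFAULT"):
    """Set the optimization flag, or fail with an actionable message."""
    if not hasattr(converter, "optimizations"):
        raise RuntimeError(
            "converter has no 'optimizations' attribute; "
            "check the installed TensorFlow version and upgrade if it is old"
        )
    converter.optimizations = [flag]
    return converter


ok = enable_default_optimizations(NewStyleConverter(model=None))
print(ok.optimizations)
```

In a real project, printing `tf.__version__` next to the error is usually the quickest way to confirm whether the installed release is the cause.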
5. You want to deploy a computer vision model on a mobile device with limited memory and CPU. Which combination of optimization techniques is best to reduce model size and speed up inference without much accuracy loss?
hard
A. Apply pruning to remove unimportant weights, then quantize weights to 8-bit integers
B. Only increase model layers to improve accuracy
C. Use full precision weights and no pruning for best accuracy
D. Train longer without any model size changes
Solution
Step 1: Understand device constraints
Mobile devices have limited memory and CPU, so model size and speed matter.
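To put rough numbers on the memory constraint, the storage needed for a layer's weights can be estimated from the weight count and the bits per weight. This is a back-of-the-envelope sketch that reuses the 5000-weight layer from question 3 purely as an example; the helper name is hypothetical and the estimate ignores biases and file-format overhead.

```python
def weight_storage_bytes(num_weights, bits_per_weight):
    """Estimate dense storage for the weights alone, in bytes."""
    return num_weights * bits_per_weight // 8


num_weights = 5000  # example: a 100 x 50 linear layer

float32_size = weight_storage_bytes(num_weights, 32)
int8_size = weight_storage_bytes(num_weights, 8)
ratio = float32_size // int8_size

print(float32_size, int8_size, ratio)
```

Going from 32-bit floats to 8-bit integers cuts weight storage by a factor of four. Pruned weights stored as zeros in a dense tensor still occupy space, so pruning reduces file size only when the model is stored in a sparse or compressed form, or when structured pruning removes whole neurons or filters.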