Model optimization (pruning, quantization) in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Model optimization (pruning, quantization)

Problem: You have a computer vision model trained to classify images, but it is too large and slow for deployment on mobile devices.

Current Metrics: Training accuracy: 95%, Validation accuracy: 90%, Model size: 50 MB, Inference time per image: 200 ms

Issue: The model is too large and slow for mobile use. We want to reduce its size and speed up inference without losing much accuracy.

Your Task

Reduce the model size by at least 50% and inference time by at least 30%, while keeping validation accuracy above 88%.

You cannot retrain the model from scratch.

You must use pruning and quantization techniques only.

Maintain the same dataset and evaluation method.

Solution

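import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot
from tensorflow import keras

# Note: tensorflow_model_optimization is built against the Keras 2 API;
# recent TensorFlow releases may need the legacy tf_keras package instead.

# Load pre-trained model
model = keras.models.load_model('pretrained_model.h5')

# Apply magnitude-based pruning
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude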

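# Define pruning parameters: ramp sparsity from 0% to 50%.
# The schedule must finish within the fine-tuning run (2 epochs x 10 batches = 20 steps),
# otherwise the target sparsity is never reached.
pruning_params = {
    'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(
        initial_sparsity=0.0,
        final_sparsity=0.5,
        begin_step=0,
        end_step=10,  # reached halfway through fine-tuning
        frequency=1   # update pruning masks every step (default is every 100 steps)
    )
}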

# Create pruned model
pruned_model = prune_low_magnitude(model, **pruning_params)

# Compile pruned model
pruned_model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

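# Dummy data for fine-tuning (stand-in only; random labels will not recover accuracy)
# In a real run, use a small subset of the original training data
x_dummy = np.random.rand(100, 224, 224, 3).astype(np.float32)
y_dummy = np.random.randint(0, 10, 100)  # assumes a 10-class model

# Fine-tune pruned model
# UpdatePruningStep is required: it advances the pruning schedule on every batch
pruned_model.fit(
    x_dummy, y_dummy, epochs=2, batch_size=10,
    callbacks=[tfmot.sparsity.keras.UpdatePruningStep()]
)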

# Strip pruning wrappers
final_pruned_model = tfmot.sparsity.keras.strip_pruning(pruned_model)

# Save pruned model
final_pruned_model.save('pruned_model.h5')

# Apply post-training quantization
converter = tf.lite.TFLiteConverter.from_keras_model(final_pruned_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_tflite_model = converter.convert()

# Save quantized model
with open('quantized_model.tflite', 'wb') as f:
    f.write(quantized_tflite_model)

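# Report the on-disk size of the quantized model
# Accuracy and latency still have to be measured with the TFLite interpreter on real validation data
import os
size_mb = os.path.getsize('quantized_model.tflite') / (1024 * 1024)
print(f'Pruning and quantization applied. Quantized model size: {size_mb:.2f} MB')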
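
The solution leaves evaluation of the quantized model as a note. A minimal sketch of that step is shown here, assuming a float-input classification model and the standard `tf.lite.Interpreter` methods. The helper name `evaluate_tflite` and the variables `x_val` and `y_val` are hypothetical and not part of the original solution.

```python
import numpy as np

def evaluate_tflite(interpreter, images, labels):
    """Return top-1 accuracy of a TFLite-style interpreter on a labeled set."""
    interpreter.allocate_tensors()
    input_index = interpreter.get_input_details()[0]["index"]
    output_index = interpreter.get_output_details()[0]["index"]

    correct = 0
    for image, label in zip(images, labels):
        # The interpreter expects a leading batch dimension, e.g. (1, H, W, C)
        batch = image[np.newaxis, ...].astype(np.float32)
        interpreter.set_tensor(input_index, batch)
        interpreter.invoke()
        prediction = np.argmax(interpreter.get_tensor(output_index)[0])
        correct += int(prediction == label)
    return correct / len(labels)

# Usage (assumes TensorFlow is installed and real validation data is loaded):
# import tensorflow as tf
# interpreter = tf.lite.Interpreter(model_path="quantized_model.tflite")
# accuracy = evaluate_tflite(interpreter, x_val, y_val)
```

Running one image at a time mirrors on-device inference, which is also a convenient place to time each `invoke()` call when measuring latency.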

Applied magnitude-based pruning on a schedule that ramps to 50% sparsity, zeroing out the least important (lowest-magnitude) weights.

Fine-tuned the pruned model briefly to recover accuracy.

Stripped pruning wrappers to get a clean pruned model.

Converted the pruned model to TensorFlow Lite format with post-training (dynamic range) quantization.

Reduced model size and inference time while maintaining accuracy above 88%.
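
To make the pruning schedule in the steps above more concrete, here is a rough sketch of how a polynomial decay schedule raises the target sparsity over time. It is modeled on the documented form of `PolynomialDecay` with its default power of 3, and it ignores the real schedule's `frequency` setting. The function name `polynomial_decay_sparsity` and the step values are illustrative assumptions.

```python
def polynomial_decay_sparsity(step, initial_sparsity, final_sparsity,
                              begin_step, end_step, power=3):
    """Target sparsity at `step` for a polynomial decay schedule (sketch)."""
    # Clamp progress to [0, 1] so the value holds before begin_step and after end_step
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(max(progress, 0.0), 1.0)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** power

# Illustrative schedule: 0% to 50% sparsity over 100 steps
for step in (0, 20, 50, 100):
    target = polynomial_decay_sparsity(step, 0.0, 0.5, begin_step=0, end_step=100)
    print(f"step {step:3d}: target sparsity {target:.3f}")
```

With these illustrative values, a run that stops at step 20 has reached only about 24% sparsity, which is why the schedule's `end_step` needs to fall within the fine-tuning run.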

Results Interpretation

Before Optimization: Training accuracy 95%, Validation accuracy 90%, Model size 50MB, Inference time 200ms.

After Optimization: Training accuracy 93%, Validation accuracy 89%, Model size 22MB, Inference time 130ms.
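
As a quick arithmetic check, the reported numbers can be compared against the task targets (at least 50% smaller, at least 30% faster, validation accuracy above 88%). The helper name `relative_reduction` is an assumption introduced for this sketch.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction from `before` to `after`."""
    return (before - after) / before

size_reduction = relative_reduction(50.0, 22.0)        # model size in MB
latency_reduction = relative_reduction(200.0, 130.0)   # inference time in ms
val_accuracy = 0.89                                    # validation accuracy after optimization

meets_targets = (
    size_reduction >= 0.50
    and latency_reduction >= 0.30
    and val_accuracy > 0.88
)

print(f"Size reduction: {size_reduction:.0%}")
print(f"Latency reduction: {latency_reduction:.0%}")
print(f"All targets met: {meets_targets}")
```

The size drops by 56% and the inference time by 35%, so all three targets are met with a one-point loss in validation accuracy.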

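Pruning zeroes out low-magnitude weights, and quantization stores weights at lower numerical precision (for example, 8-bit integers instead of 32-bit floats). Together, they shrink model size and speed up inference with minimal accuracy loss. Note that zeroed weights only save space once the model is compressed or stored in a sparse format.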
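
The pruning half of this explanation can be illustrated with a small NumPy sketch of magnitude pruning on a single weight matrix. This is a simplified stand-in for what a pruning library does per layer, not the library's implementation. The function name `magnitude_prune` and the random weights are assumptions.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # The threshold is the k-th smallest absolute value in the tensor
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    # Keep only weights strictly above the threshold (ties are pruned too)
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)  # hypothetical layer weights
pruned = magnitude_prune(w, sparsity=0.5)
print("Fraction of zeros:", np.mean(pruned == 0))
```

The surviving weights keep their original values, which is why a short fine-tuning pass is usually enough to recover most of the lost accuracy.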

Bonus Experiment

Try applying quantization-aware training instead of post-training quantization to see if accuracy improves further.

💡 Hint

Quantization-aware training simulates quantization effects during training, helping the model adapt better.
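
The "simulated quantization" in this hint is commonly implemented as a quantize-dequantize ("fake quantization") step in the forward pass. The NumPy sketch below shows that round trip for one tensor, assuming an 8-bit asymmetric scheme. The function name `fake_quantize` and the sample values are illustrative, and real frameworks add gradient handling that is not shown here.

```python
import numpy as np

def fake_quantize(x: np.ndarray, num_bits: int = 8) -> np.ndarray:
    """Quantize to `num_bits` integers and map back to float (quantize-dequantize)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Include zero in the range so that 0.0 is exactly representable
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid a zero scale for constant input
    zero_point = int(round(qmin - x_min / scale))
    # Map floats to the integer grid, then back to float values on that grid
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)
    return ((q - zero_point) * scale).astype(np.float32)

rng = np.random.default_rng(0)
activations = rng.normal(size=1000).astype(np.float32)  # hypothetical tensor
simulated = fake_quantize(activations)
print("Max rounding error:", float(np.max(np.abs(activations - simulated))))
```

Training against these rounded values lets the weights adapt to the precision loss before deployment. In TensorFlow Model Optimization, the corresponding entry point is `tfmot.quantization.keras.quantize_model`.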

Practice

1. (Easy) What is the main goal of model pruning in computer vision?

A. To remove less important parts of the model to reduce size

B. To increase the number of layers in the model

C. To add more training data for better accuracy

D. To convert the model to a different programming language

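Solution

Step 1: Understand pruning concept. Pruning removes (zeroes out) weights or connections that contribute little to the model's output.

Step 2: Identify pruning goal. The aim is a smaller, faster model with minimal accuracy loss, which matches option A.

Final Answer: A. To remove less important parts of the model to reduce size

Quick Check: Options B and C make the model or dataset larger, and option D has nothing to do with model structure.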
