Model optimization for serving (quantization, pruning) in MLOps - Interactive Code Practice

Practice - 5 Tasks

Answer the questions below

1. Fill in the blank

easy

Complete the code to apply post-training quantization using TensorFlow Lite.

import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [[1]]
tflite_model = converter.convert()

A. tf.lite.Optimize.DEFAULT

B. tf.lite.Quantize.DEFAULT

C. tf.lite.Optimize.QUANTIZE

D. tf.lite.Optimize.NONE

2. Fill in the blank

medium

Complete the code to prune a Keras model with 50% sparsity.

import tensorflow_model_optimization as tfmot
prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
pruning_params = {
  'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=0.0, final_sparsity=[1], begin_step=0, end_step=1000)
}
model = prune_low_magnitude(original_model, **pruning_params)

A. 50

B. 0.5

C. 0.05
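
As background for this task, the sketch below illustrates the idea behind magnitude-based pruning: the weights with the smallest absolute values are set to zero until a target sparsity is reached. It is a simplified pure-Python illustration, not how `tfmot` is implemented; `prune_by_magnitude` and its `sparsity` parameter are hypothetical names chosen for this example.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude share of a flat list of weights.

    Hypothetical helper for illustration only; `sparsity` is the share
    of weights to remove, expressed as a value between 0 and 1.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


weights = [0.9, -0.05, 0.3, -0.7, 0.01, 0.2, -0.4, 0.08]
pruned = prune_by_magnitude(weights, sparsity=0.25)
print(pruned)  # the two smallest-magnitude weights are now 0.0
```

In a real training setup the sparsity target is usually raised gradually by a schedule rather than applied in one shot, so the remaining weights can adapt.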

3. Fill in the blank

hard

Complete the pruning callback setup so that pruning steps are updated and pruning progress is logged during training.

import tensorflow_model_optimization as tfmot
callbacks = [tfmot.sparsity.keras.UpdatePruningStep(), [1]]
model.fit(train_data, epochs=10, callbacks=callbacks)

A. tf.keras.callbacks.EarlyStopping()

B. tfmot.sparsity.keras.PruningCallback()

C. tfmot.sparsity.keras.PruningSummaries(log_dir)

D. tf.keras.callbacks.ModelCheckpoint()

4. Fill in the blanks

hard

Fill both blanks to create a TensorFlow Lite converter that applies quantization and sets the supported ops.

import tensorflow as tf
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
converter.optimizations = [[1]]
converter.target_spec.supported_ops = [[2]]
tflite_model = converter.convert()

A. tf.lite.Optimize.DEFAULT

B. tf.lite.OpsSet.TFLITE_BUILTINS_INT8

C. tf.lite.OpsSet.TFLITE_BUILTINS

D. tf.lite.Optimize.NONE

5. Fill in the blanks

hard

Fill all three blanks to create a pruning schedule with 20% initial sparsity, 80% final sparsity, and pruning ending at step 2000.

import tensorflow_model_optimization as tfmot
pruning_params = {
  'pruning_schedule': tfmot.sparsity.keras.PolynomialDecay(initial_sparsity=[1], final_sparsity=[2], begin_step=0, end_step=[3])
}

A. 0.2

B. 0.8

C. 2000

D. 1000
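
To give a feel for what a polynomial pruning schedule does between `begin_step` and `end_step`, here is a minimal pure-Python sketch of a polynomial ramp from an initial to a final sparsity. The function name `polynomial_sparsity`, the default `power=3`, and the example values are assumptions made for illustration. The formula is believed to mirror the general shape of `PolynomialDecay`, but it ignores details such as how often the real schedule updates, so treat it as an approximation rather than the library's exact behavior.

```python
def polynomial_sparsity(step, initial_sparsity, final_sparsity,
                        begin_step, end_step, power=3):
    """Target sparsity at `step` for a polynomial ramp (illustrative only)."""
    # Share of the schedule completed, clamped to the range [0, 1].
    progress = (step - begin_step) / (end_step - begin_step)
    progress = min(1.0, max(0.0, progress))
    # Starts at initial_sparsity, rises quickly, then flattens at final_sparsity.
    remaining = (1.0 - progress) ** power
    return final_sparsity + (initial_sparsity - final_sparsity) * remaining


# Hypothetical schedule: ramp from 0.1 to 0.6 over steps 0..500.
for step in (0, 125, 250, 375, 500):
    target = polynomial_sparsity(step, 0.1, 0.6, begin_step=0, end_step=500)
    print(step, round(target, 4))
```

With a power above 1, most of the pruning happens early in the schedule and the final steps change the sparsity only slightly, which gives the network time to recover accuracy.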

Practice

(1/5)

1. What is the main goal of quantization in model optimization for serving?

easy

A. Increase the size of the model for better performance

B. Reduce the precision of numbers to make the model smaller and faster

C. Add more neurons to improve accuracy

D. Remove entire layers from the model to simplify it

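Solution

Step 1: Understand quantization purpose

Quantization stores weights (and often activations) at lower numeric precision, for example 8-bit integers instead of 32-bit floats, which shrinks the model and typically speeds up inference.

Step 2: Compare options

Options A and C would make the model larger, and option D describes removing parts of the model rather than lowering precision. Only option B describes quantization.

Final Answer:

B. Reduce the precision of numbers to make the model smaller and faster

Quick Check:

Lower precision gives a smaller, faster model, which matches option B.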
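
To make the idea of reduced precision concrete, the following is a minimal pure-Python sketch of symmetric 8-bit quantization: each float is mapped to an integer code in the range -127 to 127 using a single scale factor, and can be mapped back with a small rounding error. The names `quantize_int8` and `dequantize` are hypothetical, and production converters such as TensorFlow Lite use their own, more elaborate schemes, so this is only an illustration of the principle.

```python
def quantize_int8(values):
    """Map floats to integer codes in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Fall back to 1.0 so an all-zero input does not divide by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

print(codes)
print([round(r, 4) for r in restored])
```

Each code fits in one byte instead of the four bytes of a 32-bit float, which is where the size reduction comes from. The price is a rounding error of at most half the scale per value.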

Solution

Step 1: Recall TensorFlow pruning API structure

Step 2: Check syntax correctness

Final Answer:

Quick Check:

Solution

Step 1: Analyze dynamic quantization effect

Step 2: Trace the print statement

Final Answer:

Quick Check:

Solution

Step 1: Understand the error message

Step 2: Check common causes

Final Answer:

Quick Check:

Solution

Step 1: Understand pruning and quantization order

Step 2: Apply quantization after pruning

Final Answer:

Quick Check: