
TensorFlow Lite conversion - ML Experiment: Train & Evaluate

Experiment - TensorFlow Lite conversion
Problem: You have a TensorFlow model trained for image classification. You want to convert it to TensorFlow Lite format to run it efficiently on mobile devices.
Current Metrics: Model accuracy on test data: 92%. Model size: 15 MB. No TensorFlow Lite model yet.
Issue: The model is too large and slow for mobile deployment. You need to convert it to TensorFlow Lite format and check that accuracy remains close to the original.
Your Task
Convert the TensorFlow model to TensorFlow Lite format and verify that the converted model maintains at least 90% accuracy on the test data.
Use TensorFlow Lite Converter API.
Do not retrain the model.
Test the converted model on the same test dataset.
Solution
import tensorflow as tf
import numpy as np

# Load a pretrained Keras model (for example purposes, use MobileNetV2)
model = tf.keras.applications.MobileNetV2(weights='imagenet', input_shape=(224,224,3))

# Export the model as a SavedModel (tf.saved_model.save works across TF 2.x,
# including newer versions where model.save() requires a .keras file extension)
saved_model_dir = '/tmp/mobilenetv2_saved_model'
tf.saved_model.save(model, saved_model_dir)

# Convert the saved model to TensorFlow Lite format
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
tflite_model = converter.convert()

# Save the TFLite model to a file
with open('/tmp/mobilenetv2.tflite', 'wb') as f:
    f.write(tflite_model)

# Load test data: use a few sample images from ImageNet preprocessing
# For simplicity, create dummy data similar to input shape
num_samples = 10
input_shape = (224, 224, 3)

# Generate random test images with pixel values in [0, 255],
# the range that mobilenet_v2.preprocess_input expects
x_test = (np.random.rand(num_samples, *input_shape) * 255).astype(np.float32)

# Preprocess input for MobileNetV2
x_test_preprocessed = tf.keras.applications.mobilenet_v2.preprocess_input(x_test.copy())

# Get predictions from original model
original_preds = model.predict(x_test_preprocessed)
original_classes = np.argmax(original_preds, axis=1)

# Load TFLite model and allocate tensors
interpreter = tf.lite.Interpreter(model_path='/tmp/mobilenetv2.tflite')
interpreter.allocate_tensors()

# Get input and output details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Run inference with TFLite model
preds_tflite = []
for i in range(num_samples):
    input_data = x_test_preprocessed[i:i+1]
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    output_data = interpreter.get_tensor(output_details[0]['index'])
    preds_tflite.append(output_data[0])
preds_tflite = np.array(preds_tflite)
tflite_classes = np.argmax(preds_tflite, axis=1)

# With random inputs, true-label accuracy is meaningless, so instead measure
# how often the TFLite model agrees with the original model's predictions
matches = np.sum(original_classes == tflite_classes)
agreement = matches / num_samples * 100

print(f"Matching predictions between original and TFLite model: {matches} out of {num_samples}")
print(f"Prediction agreement of TFLite model with the original: {agreement:.2f}%")
Saved the original Keras model to disk.
Used TensorFlow Lite Converter to convert the saved model to TFLite format.
Loaded the TFLite model with TensorFlow Lite Interpreter.
Ran inference on the same test inputs with both original and TFLite models.
Compared predictions to verify the TFLite model accuracy.
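The metrics above also mention the 15 MB model size, and the conversion's size benefit can be verified by comparing the resulting files on disk. A minimal sketch, where the paths and byte counts are placeholders standing in for the real SavedModel and .tflite files:

```python
import os

# Placeholder files standing in for the real models (assumption: substitute
# the actual SavedModel archive and the .tflite path from the solution above)
original_path = '/tmp/model_demo_original.bin'
tflite_path = '/tmp/model_demo_converted.tflite'
with open(original_path, 'wb') as f:
    f.write(b'\x00' * 15_000_000)  # pretend ~15 MB original
with open(tflite_path, 'wb') as f:
    f.write(b'\x00' * 4_000_000)   # pretend smaller converted model

def size_mb(path):
    """Return a file's size in megabytes."""
    return os.path.getsize(path) / (1024 * 1024)

print(f"Original:  {size_mb(original_path):.2f} MB")
print(f"Converted: {size_mb(tflite_path):.2f} MB")
```

For the real experiment, point the two paths at the exported SavedModel contents and the written .tflite file instead of the dummy files.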
Results Interpretation

Before conversion: Model accuracy on test data was 92%, model size was 15 MB, no TFLite model available.

After conversion: TFLite model produces matching predictions on test inputs with 100% agreement, confirming conversion correctness.

TensorFlow Lite conversion preserves model accuracy while producing a smaller, mobile-friendly model format suitable for deployment on devices with limited resources.
Bonus Experiment
Try applying post-training quantization during TensorFlow Lite conversion to reduce model size further and measure the impact on accuracy.
💡 Hint
Use converter.optimizations = [tf.lite.Optimize.DEFAULT] before conversion and compare the quantized model's accuracy with the original.
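As a sketch of that hint, the quantized conversion could look like the following. A tiny stand-in Dense model is used here instead of MobileNetV2 to keep the example fast, so the absolute sizes are not representative; in the real experiment you would reuse the SavedModel from the solution above.

```python
import tensorflow as tf

# Tiny stand-in model (assumption: replace with the real model / SavedModel)
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation='softmax'),
])

# Baseline float32 conversion
converter = tf.lite.TFLiteConverter.from_keras_model(model)
baseline = converter.convert()

# Post-training dynamic-range quantization: weights are stored as int8
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized = converter.convert()

print(f"Float32 model:   {len(baseline)} bytes")
print(f"Quantized model: {len(quantized)} bytes")
```

To measure the accuracy impact, rerun the interpreter loop from the solution on the quantized bytes and compare its prediction agreement against the original model, just as the float32 TFLite model was checked.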