When deploying models on mobile devices, the key metrics are model size, inference latency, and accuracy.
Model size matters because mobile devices have limited storage and memory; a smaller model downloads faster, occupies less space on disk, and loads more quickly at app startup.
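As a rough back-of-the-envelope sketch (not tied to any specific framework, and using a hypothetical 5-million-parameter model), on-disk size can be estimated from the parameter count times the bytes per weight; quantizing float32 weights to int8 cuts this estimate by about 4x:

```python
def model_size_mb(num_params: int, bytes_per_param: int) -> float:
    """Rough on-disk size estimate: parameter count * bytes per weight."""
    return num_params * bytes_per_param / (1024 ** 2)

# Hypothetical 5M-parameter model, before and after int8 quantization.
fp32_mb = model_size_mb(5_000_000, 4)  # float32: 4 bytes per weight
int8_mb = model_size_mb(5_000_000, 1)  # int8: 1 byte per weight
print(f"float32: {fp32_mb:.1f} MB, int8: {int8_mb:.1f} MB")
```

Real serialized models also include metadata and graph structure, so actual files are somewhat larger than this weight-only estimate.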
Inference latency is the time the model takes to produce a prediction on the device; lower latency means a more responsive user experience.
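Latency is usually measured by timing repeated inference calls after a few warm-up runs and reporting a robust statistic such as the median. A minimal sketch, with a stand-in function in place of a real on-device model call:

```python
import statistics
import time

def measure_latency_ms(infer, runs: int = 50, warmup: int = 5) -> float:
    """Median wall-clock latency of one inference call, in milliseconds."""
    for _ in range(warmup):
        infer()  # warm up caches and any lazy initialization before timing
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)  # median is robust to OS scheduling jitter

# Hypothetical stand-in for a real model's forward pass.
fake_model = lambda: sum(i * i for i in range(10_000))
print(f"median latency: {measure_latency_ms(fake_model):.2f} ms")
```

On a real device the same idea applies, but measurements should be taken on the target hardware, since desktop timings do not transfer to mobile CPUs or accelerators.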
Accuracy measures how well the model predicts; the goal is to preserve it as much as possible when shrinking or speeding up the model, since compression techniques such as quantization and pruning can degrade it.
Deploying well means balancing all three: a model that is small and fast but inaccurate, or accurate but too large or slow, will not deliver a good experience on a phone.