TensorFlow.js Conversion - Model Metrics & Evaluation

When converting a model to TensorFlow.js, the key metric to check is model accuracy (or whichever primary metric was used during training), which confirms the model still performs well after conversion. Model size and inference speed also matter, because TensorFlow.js runs in browsers where memory and compute are limited.
After conversion, test the model on a validation set and compare predictions to true labels. For example, a confusion matrix for a classification model might look like this:
|                 | Predicted Positive     | Predicted Negative     |
|-----------------|------------------------|------------------------|
| Actual Positive | True Positive (TP): 45 | False Negative (FN): 5 |
| Actual Negative | False Positive (FP): 3 | True Negative (TN): 47 |
Check that these numbers are close to the original model's results to confirm conversion quality.
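The comparison can be done directly from the confusion-matrix counts. A minimal sketch in plain JavaScript, using the example numbers from the table above:

```javascript
// Derive the standard metrics from confusion-matrix counts.
function metrics({ tp, fn, fp, tn }) {
  const accuracy = (tp + tn) / (tp + fn + fp + tn); // fraction correct overall
  const precision = tp / (tp + fp);                 // of flagged positives, how many were real
  const recall = tp / (tp + fn);                    // of real positives, how many were caught
  return { accuracy, precision, recall };
}

// Counts from the example confusion matrix above.
const m = metrics({ tp: 45, fn: 5, fp: 3, tn: 47 });
console.log(m.accuracy.toFixed(2));  // 0.92
console.log(m.precision.toFixed(2)); // 0.94
console.log(m.recall.toFixed(2));    // 0.90
```

Run the same function over the original model's predictions and the converted model's predictions, and compare the two sets of numbers.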
Precision and recall should stay close to the original model's values after conversion. In a spam filter, for example, high precision means few legitimate emails are flagged as spam, and high recall means most spam is caught; if conversion lowers recall, the browser-side model will start missing spam. Keeping both metrics balanced preserves the user experience after conversion.
Good: The converted model has accuracy, precision, and recall within 1-2% of the original. Model size is small enough to load quickly in browsers. Inference time is fast enough for smooth user interaction.
Bad: Accuracy drops by more than 5%, or precision/recall drop significantly. Model size is too large causing slow loading. Inference is slow, causing lag in the app.
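These pass/fail rules can be automated. A minimal sketch, where the 0.02 "warn" and 0.05 "fail" thresholds mirror the 1-2% and 5% rules of thumb above (tune them for your application):

```javascript
// Flag per-metric drift between the original and converted model.
function checkDrift(original, converted, warn = 0.02, fail = 0.05) {
  const report = {};
  for (const name of Object.keys(original)) {
    const drop = original[name] - converted[name];
    report[name] = drop > fail ? "fail" : drop > warn ? "warn" : "ok";
  }
  return report;
}

// Hypothetical before/after metrics for a converted model.
console.log(checkDrift(
  { accuracy: 0.95, precision: 0.94, recall: 0.93 },
  { accuracy: 0.94, precision: 0.93, recall: 0.86 }
));
// → { accuracy: 'ok', precision: 'ok', recall: 'fail' }
```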
- Ignoring metric drops: failing to check whether accuracy or other metrics degrade after conversion.
- Data leakage: validating the converted model on data it saw during training or conversion gives misleadingly good results.
- Overfitting signs: if the converted model performs much better on training data than on validation data, it may be overfitting.
- Model size vs. performance: compressing the model too aggressively to shrink its size can hurt accuracy.
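The overfitting pitfall above can be caught with a simple gap check. A minimal sketch, where the 0.05 gap threshold is an illustrative assumption, not a standard value:

```javascript
// Warn when training accuracy far exceeds validation accuracy,
// a classic sign of overfitting.
function overfitGap(trainAcc, valAcc, threshold = 0.05) {
  return (trainAcc - valAcc) > threshold;
}

console.log(overfitGap(0.99, 0.85)); // true  – likely overfitting
console.log(overfitGap(0.93, 0.91)); // false – gap within tolerance
```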
Your model converted to TensorFlow.js has 98% accuracy but only 12% recall on fraud detection. Is it good for production? Why or why not?
Answer: No, it is not production-ready. Fraud is rare, so class imbalance inflates accuracy: a model can score 98% while missing nearly all fraud. The 12% recall means the model misses most fraud cases, and for fraud detection recall is the critical metric because missed fraud is costly. The model needs improvement before production.
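A quick worked example makes the answer concrete. The counts below are hypothetical (1000 transactions, only 25 of them fraudulent) but chosen to reproduce the scenario in the question:

```javascript
// With heavy class imbalance, accuracy can look great while recall is terrible.
const tp = 3;   // fraud correctly flagged
const fn = 22;  // fraud missed
const fp = 0;   // legitimate transactions wrongly flagged
const tn = 975; // legitimate transactions correctly passed

const accuracy = (tp + tn) / (tp + fn + fp + tn);
const recall = tp / (tp + fn);

console.log(accuracy); // 0.978 – looks excellent
console.log(recall);   // 0.12  – misses 88% of fraud
```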