
Emerging trends (smaller models, edge AI) in Prompt Engineering / GenAI - Model Metrics & Evaluation

Which metric matters for Emerging Trends (smaller models, edge AI) and WHY

For smaller models and edge AI, key metrics include model size, latency, and energy efficiency. Accuracy remains important but must be balanced with these constraints. We want models that are small and fast enough to run on devices like phones or sensors, while still making good predictions.

Confusion matrix example for edge AI classification
      |                 | Predicted Positive     | Predicted Negative     |
      |-----------------|------------------------|------------------------|
      | Actual Positive | True Positive (TP): 40 | False Negative (FN): 10 |
      | Actual Negative | False Positive (FP): 5 | True Negative (TN): 45  |

      Total samples = 40 + 10 + 5 + 45 = 100

      Precision = TP / (TP + FP) = 40 / (40 + 5) ≈ 0.89
      Recall = TP / (TP + FN) = 40 / (40 + 10) = 0.80
      Accuracy = (TP + TN) / Total = (40 + 45) / 100 = 0.85
    

This shows a balanced model that works well on-device with good precision and recall.
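The arithmetic above can be checked in a few lines of Python, with the four cells of the confusion matrix as inputs:

```python
# Cells of the confusion matrix above.
tp, fn, fp, tn = 40, 10, 5, 45

total = tp + fn + fp + tn            # 100 samples
precision = tp / (tp + fp)           # how often a positive call is right
recall = tp / (tp + fn)              # how many real positives are caught
accuracy = (tp + tn) / total         # overall fraction correct

print(f"Precision: {precision:.2f}")  # 0.89
print(f"Recall:    {recall:.2f}")     # 0.80
print(f"Accuracy:  {accuracy:.2f}")   # 0.85
```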

Precision vs Recall tradeoff in edge AI

Imagine a smart home camera detecting intruders. High precision means it rarely mistakes a family member for an intruder (few false alarms). High recall means it catches almost all real intruders (few misses). On edge devices, we must balance these because complex models that improve recall might be too slow or large.

Choosing the right tradeoff depends on what matters more: avoiding false alarms (precision) or catching every threat (recall).
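One practical knob for this tradeoff is the decision threshold applied to the model's confidence score. A minimal sketch, using made-up scores and labels (1 = intruder), shows how raising the threshold trades recall for precision:

```python
# Hypothetical detector scores (higher = more confident it's an intruder)
# and the corresponding true labels. Purely illustrative numbers.
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25, 0.15, 0.05]
labels = [1,    1,    1,    0,    1,    0,    1,    0,    0,    0]

def precision_recall(threshold):
    """Precision and recall when alerting on scores >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for t in (0.3, 0.5, 0.7):
    p, r = precision_recall(t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
```

With these numbers, a low threshold (0.3) catches every intruder but raises more false alarms, while a high threshold (0.7) eliminates false alarms at the cost of missed detections.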

Good vs Bad metric values for smaller models and edge AI

Good: Accuracy around 85%+, precision and recall balanced above 80%, model size under 10MB, latency under 100ms, and low power use.

Bad: Accuracy below 70%, very low recall (missing many cases), model size too large to run on device, or latency causing slow responses.
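These budgets can be enforced as a simple deployment gate. A sketch, with illustrative threshold values taken from the targets above (all names and numbers are assumptions, not a standard API):

```python
# Hypothetical edge-deployment gate using the budgets described above.
BUDGETS = {
    "accuracy_min": 0.85,
    "precision_min": 0.80,
    "recall_min": 0.80,
    "size_mb_max": 10.0,
    "latency_ms_max": 100.0,
}

def failed_budget_checks(metrics):
    """Return the names of failed checks; an empty list means deployable."""
    failures = []
    if metrics["accuracy"] < BUDGETS["accuracy_min"]:
        failures.append("accuracy")
    if metrics["precision"] < BUDGETS["precision_min"]:
        failures.append("precision")
    if metrics["recall"] < BUDGETS["recall_min"]:
        failures.append("recall")
    if metrics["size_mb"] > BUDGETS["size_mb_max"]:
        failures.append("size")
    if metrics["latency_ms"] > BUDGETS["latency_ms_max"]:
        failures.append("latency")
    return failures

# The example model from the confusion matrix, with made-up size/latency.
model = {"accuracy": 0.85, "precision": 0.89, "recall": 0.80,
         "size_mb": 8.2, "latency_ms": 45.0}
print(failed_budget_checks(model))  # [] -> meets every budget
```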

Common pitfalls in evaluating smaller models and edge AI
  • Ignoring latency and size: A model with great accuracy but too big or slow is unusable on edge.
  • Overfitting: Small models can overfit if not trained well, leading to poor real-world results.
  • Data leakage: Using test data during training inflates accuracy falsely.
  • Accuracy paradox: High accuracy on imbalanced data can be misleading if recall or precision is low.
Self-check question

Your edge AI model has 98% accuracy but only 12% recall on detecting faults. Is it good for production? Why or why not?

Answer: No, it is not good. The low recall means the model misses most faults, which is critical to detect. High accuracy can be misleading if the data is imbalanced with many normal cases.
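The self-check numbers are easy to reproduce with a constructed confusion matrix (the counts below are illustrative, chosen to yield exactly 98% accuracy and 12% recall):

```python
# Illustrative imbalanced dataset: 10,000 readings, only 100 real faults.
tp, fn = 12, 88        # faults: only 12 of 100 are caught
tn, fp = 9788, 112     # normal readings: mostly classified correctly

accuracy = (tp + tn) / (tp + fn + tn + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.2f}")  # 0.98 -- looks impressive
print(f"recall={recall:.2f}")      # 0.12 -- misses 88 of 100 faults
```

Because normal readings dominate, a model can score 98% accuracy while being nearly useless at its actual job of catching faults.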

Key Result
For smaller models and edge AI, balance accuracy with model size, latency, and energy use to ensure practical, effective deployment.