
ONNX Runtime in Computer Vision - Model Metrics & Evaluation

Metrics & Evaluation - ONNX Runtime
Which metrics matter for ONNX Runtime, and why

ONNX Runtime is an inference engine for running trained machine learning models quickly and efficiently. For computer vision, the two key metrics are inference speed (latency or throughput) and model accuracy. Speed matters because real-time tasks such as object detection in video need each prediction within a fixed time budget. Accuracy matters because a fast model that makes wrong predictions is not useful. So we measure both how fast the model runs and how correct its predictions are.
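To make "how fast the model runs" concrete, here is a minimal timing sketch using only the standard library. The helper name `measure_latency_ms` and the `dummy_predict` stand-in are illustrative assumptions, not part of ONNX Runtime; in practice `predict` would wrap a call such as `session.run(None, {input_name: x})`.

```python
import statistics
import time


def measure_latency_ms(predict, sample, warmup=5, runs=50):
    """Return (median, p95) latency in milliseconds for predict(sample)."""
    # Warm-up calls are not timed; early calls often include one-off setup cost.
    for _ in range(warmup):
        predict(sample)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000.0)

    timings.sort()
    p95_index = min(len(timings) - 1, int(0.95 * len(timings)))
    return statistics.median(timings), timings[p95_index]


# Hypothetical stand-in for a real model call.
def dummy_predict(sample):
    return sum(sample)


median_ms, p95_ms = measure_latency_ms(dummy_predict, list(range(1000)))
print(f"median: {median_ms:.3f} ms, p95: {p95_ms:.3f} ms")
```

Reporting the median together with a high percentile is a design choice: a mean can hide occasional slow calls that matter for real-time video.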

Confusion matrix example for ONNX Runtime model
    Confusion Matrix (Example for a 2-class image classifier):

          Predicted
          Cat   Dog
    Actual
    Cat   85    15
    Dog   10    90

    Total samples = 85 + 15 + 10 + 90 = 200

    True Positives (TP) = 85 (correctly predicted Cat)
    False Positives (FP) = 10 (Dog predicted as Cat)
    True Negatives (TN) = 90 (correctly predicted Dog)
    False Negatives (FN) = 15 (Cat predicted as Dog)
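The four counts above are enough to derive the usual summary metrics. The short sketch below treats Cat as the positive class, as the example does, and applies the standard definitions of accuracy, precision, recall, and F1.

```python
# Counts from the confusion matrix above, with Cat as the positive class.
tp, fp, tn, fn = 85, 10, 90, 15

total = tp + fp + tn + fn
accuracy = (tp + tn) / total                          # share of all predictions that are correct
precision = tp / (tp + fp)                            # share of "Cat" predictions that are cats
recall = tp / (tp + fn)                               # share of actual cats that were found
f1 = 2 * precision * recall / (precision + recall)    # harmonic mean of the two

print(f"accuracy:  {accuracy:.3f}")   # 0.875
print(f"precision: {precision:.3f}")  # 0.895
print(f"recall:    {recall:.3f}")     # 0.850
print(f"f1:        {f1:.3f}")         # 0.872
```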
    
Precision vs Recall tradeoff with ONNX Runtime

Imagine ONNX Runtime runs a model to detect cats in photos.

  • Precision means: Of all photos predicted as cats, how many really are cats? High precision means fewer false alarms.
  • Recall means: Of all actual cat photos, how many did the model find? High recall means fewer missed cats.

If ONNX Runtime speeds up the model but the model misses many cats (low recall), it is not good for applications like pet monitoring. If it finds many cats but also mistakes dogs for cats (low precision), it causes false alerts.

So the goal is to gain speed from ONNX Runtime while checking that the model's precision and recall stay at the level the application needs.
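The tradeoff between precision and recall usually shows up when the decision threshold moves. The sketch below uses made-up scores and labels (both hypothetical) to show how raising the threshold trades recall for precision; the helper `precision_recall_at` is illustrative and not an ONNX Runtime function.

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when scores >= threshold are predicted positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical "cat" scores for 8 photos; label 1 = cat, 0 = not a cat.
scores = [0.95, 0.90, 0.80, 0.65, 0.55, 0.40, 0.30, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

for threshold in (0.3, 0.6, 0.85):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

With this toy data, the lowest threshold finds every cat but raises false alarms, while the highest threshold raises none but misses half the cats.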

Good vs Bad metric values for ONNX Runtime models

Good values:

  • Accuracy above 90% on test images
  • Precision and recall both above 85%
  • Inference time reduced by 50% compared to original model

Bad values:

  • Accuracy below 70%, meaning many wrong predictions
  • Precision or recall below 50%, causing many false alarms or misses
  • Inference speed not improved or slower, defeating ONNX Runtime's purpose
Common pitfalls when evaluating ONNX Runtime models
  • Ignoring accuracy drop: Speeding up with ONNX Runtime may reduce accuracy if model conversion is not done carefully.
  • Data leakage: Testing on data the model saw during training gives false high accuracy.
  • Overfitting: The model performs well on training data but poorly on new images, so metrics computed on training data are misleading.
  • Measuring only speed: Fast inference is good but useless if predictions are wrong.
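One way to catch an accuracy drop after conversion is to feed the same inputs to the original model and to the ONNX Runtime session, then compare the outputs. The sketch below assumes both runs have already produced per-image class scores as plain lists; the scores, the tolerance `atol`, and the helper `compare_outputs` are hypothetical.

```python
def compare_outputs(reference, candidate, atol=1e-4):
    """Compare per-sample score lists from two runs over the same inputs.

    Returns (max_abs_diff, top1_agreement, within_tolerance).
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("both runs must cover the same non-empty set of samples")

    max_abs_diff = 0.0
    same_top1 = 0
    for ref_row, cand_row in zip(reference, candidate):
        diff = max(abs(a - b) for a, b in zip(ref_row, cand_row))
        max_abs_diff = max(max_abs_diff, diff)
        ref_top = max(range(len(ref_row)), key=lambda i: ref_row[i])
        cand_top = max(range(len(cand_row)), key=lambda i: cand_row[i])
        if ref_top == cand_top:
            same_top1 += 1

    agreement = same_top1 / len(reference)
    return max_abs_diff, agreement, max_abs_diff <= atol


# Hypothetical class scores for three images: original framework vs. ONNX Runtime.
original_scores = [[0.10, 0.90], [0.80, 0.20], [0.45, 0.55]]
onnx_scores = [[0.10, 0.90], [0.80, 0.20], [0.52, 0.48]]

max_diff, agreement, ok = compare_outputs(original_scores, onnx_scores)
print(f"max abs diff={max_diff:.3f}, top-1 agreement={agreement:.2f}, ok={ok}")
```

Here the third image flips its predicted class, so the check fails even though two of three predictions agree. With real arrays, `numpy.allclose` is a common alternative for the numeric part of the comparison.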
Self-check question

Your ONNX Runtime model runs 3 times faster than the original and has 98% accuracy, but only 12% recall on detecting a rare object. Is it good for production? Why or why not?

Answer: No, it is not good. Although the model is fast and has high overall accuracy, the very low recall means it misses most rare objects. For rare object detection, missing them is critical, so recall must be higher even if speed is slightly lower.
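A small worked example shows how 98% accuracy and 12% recall can coexist. The counts below are hypothetical, chosen only so that the arithmetic matches the figures in the question.

```python
# Hypothetical test set: 10,000 images, 200 of which contain the rare object.
tp, fn = 24, 176      # the model finds 24 of the 200 rare objects
fp, tn = 24, 9776     # and raises 24 false alarms on the 9,800 other images

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
precision = tp / (tp + fp)

# Accuracy of a model that always answers "no rare object".
always_negative_accuracy = (fp + tn) / total

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
print(f"always-negative baseline accuracy={always_negative_accuracy:.2f}")
```

In this example a model that never reports the object reaches the same accuracy, which is why accuracy alone says little when the positive class is rare.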

Key Result
ONNX Runtime models must balance fast inference speed with high accuracy, precision, and recall to be effective.

Practice

1. What is the main purpose of ONNX Runtime in machine learning?
easy
A. To collect and label training data
B. To train new machine learning models from scratch
C. To visualize data and create charts
D. To run pre-trained machine learning models efficiently on different devices

Solution

  1. Step 1: Understand ONNX Runtime's role

    ONNX Runtime is designed to run models that are already trained, not to train new ones.
  2. Step 2: Identify the correct purpose

    It helps run these models efficiently on many devices, making deployment easier.
  3. Final Answer:

    To run pre-trained machine learning models efficiently on different devices -> Option D
  4. Quick Check:

    ONNX Runtime runs models = D [OK]
Hint: ONNX Runtime runs models; it does not train them [OK]
Common Mistakes:
  • Confusing ONNX Runtime with training frameworks
  • Thinking it is for data visualization
  • Assuming it collects or labels data
2. Which Python code snippet correctly loads an ONNX model using ONNX Runtime?
easy
A. import onnxruntime as ort
   session = ort.Model('model.onnx')
B. import onnxruntime as ort
   session = ort.load_model('model.onnx')
C. import onnxruntime as ort
   session = ort.InferenceSession('model.onnx')
D. import onnxruntime as ort
   session = ort.run('model.onnx')

Solution

  1. Step 1: Recall ONNX Runtime loading method

    The correct method to load a model is using InferenceSession with the model file path.
  2. Step 2: Check each option

    Only option C calls ort.InferenceSession with the model path; the other options call functions that are not part of the onnxruntime API.
  3. Final Answer:

    import onnxruntime as ort
    session = ort.InferenceSession('model.onnx') -> Option C
  4. Quick Check:

    Use InferenceSession to load model = C [OK]
Hint: Use ort.InferenceSession('model.onnx') to load model [OK]
Common Mistakes:
  • Using non-existent methods like load_model or run
  • Not importing onnxruntime correctly
  • Confusing model loading with running
3. Given the code below, what will be the output type of outputs?
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: input_data})
print(type(outputs))
medium
A. <class 'list'>
B. <class 'numpy.ndarray'>
C. <class 'dict'>
D. <class 'tuple'>

Solution

  1. Step 1: Understand session.run output

    Calling session.run returns a list of outputs from the model.
  2. Step 2: Check the print statement

    Printing type(outputs) will show <class 'list'> because outputs is a list.
  3. Final Answer:

    <class 'list'> -> Option A
  4. Quick Check:

    session.run returns list = A [OK]
Hint: session.run returns a list of outputs [OK]
Common Mistakes:
  • Assuming outputs is a numpy array directly
  • Thinking outputs is a dictionary
  • Confusing tuple with list
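Because session.run returns a list, the first model output is read with index 0 before any post-processing. The sketch below assumes that first output holds one row of class scores per image; `top1_per_sample` is a hypothetical helper, and nested lists stand in for the numpy arrays a real session would return.

```python
def top1_per_sample(outputs):
    """Return the arg-max class index for each row of the first model output.

    `outputs` is the list returned by session.run; outputs[0] is assumed to be
    a (batch, num_classes) collection of scores.
    """
    first_output = outputs[0]
    return [
        max(range(len(row)), key=lambda i: row[i])
        for row in first_output
    ]


# Hypothetical scores for a batch of two images and three classes.
fake_outputs = [[[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]]
print(type(fake_outputs))             # <class 'list'>
print(top1_per_sample(fake_outputs))  # [1, 0]
```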
4. Identify the error in the following ONNX Runtime code snippet:
import onnxruntime as ort
session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0]
input_data = [1.0, 2.0, 3.0]
outputs = session.run(None, {input_name: input_data})
medium
A. input_name should be the name string, not the input object
B. input_data must be a dictionary, not a list
C. session.run requires the model path as first argument
D. onnxruntime does not support list inputs

Solution

  1. Step 1: Check input_name assignment

    session.get_inputs()[0] returns an input object, but session.run expects the input name string as key.
  2. Step 2: Correct usage

    Use session.get_inputs()[0].name to get the input name string for the dictionary key.
  3. Final Answer:

    input_name should be the name string, not the input object -> Option A
  4. Quick Check:

    Use input_name = session.get_inputs()[0].name [OK]
Hint: Use input_name = session.get_inputs()[0].name [OK]
Common Mistakes:
  • Using input object instead of input name string
  • Passing wrong input data types
  • Misunderstanding session.run arguments
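The fix can be sketched as a small helper that builds the input dictionary with the name string as its key. A stand-in object replaces the real session so the sketch runs without a model file; the input name "input" and the helper `build_feed` are hypothetical.

```python
from types import SimpleNamespace


def build_feed(session, data):
    """Build the input dict for session.run from the first declared input."""
    first_input = session.get_inputs()[0]
    input_name = first_input.name  # the string name, not the metadata object
    if not isinstance(input_name, str):
        raise TypeError("feed keys must be input name strings")
    return {input_name: data}


# Stand-in session (assumption): a real ort.InferenceSession also exposes
# get_inputs(), whose items carry a .name attribute.
fake_session = SimpleNamespace(get_inputs=lambda: [SimpleNamespace(name="input")])
feed = build_feed(fake_session, [1.0, 2.0, 3.0])
print(feed)  # {'input': [1.0, 2.0, 3.0]}
```

With a real session, tensor inputs are normally passed as numpy arrays rather than Python lists, for example `np.asarray(values, dtype=np.float32)` shaped to match what the model expects, followed by `session.run(None, feed)`.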
5. You want to run an ONNX model on a GPU using ONNX Runtime. Which code snippet correctly enables GPU execution?
hard
A. import onnxruntime as ort
   session = ort.InferenceSession('model.onnx', execution_mode='GPU')
B. import onnxruntime as ort
   session = ort.InferenceSession('model.onnx', providers=['CUDAExecutionProvider'])
C. import onnxruntime as ort
   session = ort.InferenceSession('model.onnx', use_gpu=True)
D. import onnxruntime as ort
   session = ort.InferenceSession('model.onnx', device='GPU')

Solution

  1. Step 1: Recall how to enable GPU in ONNX Runtime

    ONNX Runtime uses the 'providers' argument with 'CUDAExecutionProvider' to run on GPU.
  2. Step 2: Check each option

    Only option B passes providers=['CUDAExecutionProvider']; the other options use parameters that InferenceSession does not accept.
  3. Final Answer:

    import onnxruntime as ort
    session = ort.InferenceSession('model.onnx', providers=['CUDAExecutionProvider']) -> Option B
  4. Quick Check:

    Use providers=['CUDAExecutionProvider'] for GPU [OK]
Hint: Set providers=['CUDAExecutionProvider'] to use GPU [OK]
Common Mistakes:
  • Using non-existent parameters like device or use_gpu
  • Confusing execution_mode with providers
  • Assuming the GPU is used when providers is not specified
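In practice it helps to request the GPU provider only when the installation offers it and to keep the CPU provider as a fallback. The sketch below assumes a GPU-enabled ONNX Runtime build for the CUDA path; `choose_providers` and `create_session` are hypothetical helper names, while `ort.get_available_providers()` and `session.get_providers()` are ONNX Runtime calls.

```python
def choose_providers(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Keep the preferred providers that this installation offers, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError(f"none of {preferred} found in {available}")
    return chosen


def create_session(model_path):
    """Create a session on the GPU when possible, falling back to the CPU."""
    import onnxruntime as ort  # imported lazily so choose_providers runs without it

    providers = choose_providers(ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # get_providers() reports the providers the session actually uses.
    print("active providers:", session.get_providers())
    return session


print(choose_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
print(choose_providers(["CPUExecutionProvider"]))
```

Checking `session.get_providers()` after creation is worthwhile because a requested provider that cannot be initialized may leave the session running on the CPU.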