ONNX Runtime in Computer Vision - ML Experiment: Train & Evaluate

Experiment - ONNX Runtime
Problem: You have a computer vision model trained in PyTorch that performs image classification. The model is accurate but slow at inference time. You want to speed up inference using ONNX Runtime.
Current Metrics: Inference time per image: 120 ms, Accuracy on validation set: 85%
Issue: Inference is too slow for real-time applications, although accuracy is good.
Your Task
Reduce the inference time per image to under 50 ms while maintaining accuracy above 83%.
You must use ONNX Runtime for inference.
Do not retrain or change the model architecture.
Use the existing trained PyTorch model.
Solution
import torch
import torchvision.models as models
import torchvision.transforms as transforms
from PIL import Image
import onnx
import onnxruntime as ort
import time

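# Load pretrained PyTorch model
# (torchvision >= 0.13 uses the weights= argument; pretrained=True is deprecated)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)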
model.eval()

# Sample image preprocessing
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

# Load and preprocess image
img = Image.new('RGB', (224, 224), color='red')  # Dummy image
input_tensor = preprocess(img)
input_batch = input_tensor.unsqueeze(0)  # Create batch dimension

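# Measure PyTorch inference time
# (one untimed warm-up run first, so one-time setup cost is not counted)
with torch.no_grad():
    model(input_batch)
    start = time.perf_counter()
    output = model(input_batch)
    end = time.perf_counter()
pytorch_inference_time = (end - start) * 1000  # ms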

# Export to ONNX
onnx_model_path = 'resnet18.onnx'
torch.onnx.export(model, input_batch, onnx_model_path, opset_version=12,
                  input_names=['input'], output_names=['output'],
                  dynamic_axes={'input': {0: 'batch_size'}, 'output': {0: 'batch_size'}})

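# Load ONNX model and create inference session
# (providers is passed explicitly; GPU builds of ONNX Runtime require it)
ort_session = ort.InferenceSession(onnx_model_path, providers=['CPUExecutionProvider'])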

# Prepare input for ONNX Runtime
ort_inputs = {ort_session.get_inputs()[0].name: input_batch.numpy()}

# Measure ONNX Runtime inference time (one untimed warm-up run first)
ort_session.run(None, ort_inputs)
start = time.perf_counter()
ort_outs = ort_session.run(None, ort_inputs)
end = time.perf_counter()
onnx_inference_time = (end - start) * 1000  # ms

# Check accuracy similarity (dummy check since no labels)
# Just compare top predicted class from PyTorch and ONNX
pytorch_pred = torch.argmax(output, dim=1).item()
onnx_pred = int(ort_outs[0].argmax(axis=1)[0])
accuracy_match = pytorch_pred == onnx_pred

print(f'PyTorch inference time: {pytorch_inference_time:.2f} ms')
print(f'ONNX Runtime inference time: {onnx_inference_time:.2f} ms')
print(f'Predictions match: {accuracy_match}')
Exported the PyTorch model to ONNX format using torch.onnx.export.
Used ONNX Runtime's InferenceSession for faster inference.
Measured and compared inference times before and after conversion.
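A single timed call is noisy, so the comparison in the last step is usually more reliable when it is repeated. The sketch below is one possible way to do that; the helper name benchmark_ms and its warmup and runs defaults are assumptions chosen for illustration, not part of the original solution.

```python
import statistics
import time


def benchmark_ms(fn, warmup=5, runs=30):
    """Return the median latency of fn() in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)


# Stand-in workload so the sketch runs on its own; in the experiment this
# lambda would wrap the PyTorch forward pass or the ONNX Runtime call.
latency = benchmark_ms(lambda: sum(range(10_000)))
print(f"median latency: {latency:.3f} ms")
```

In the experiment, the same helper could be called as benchmark_ms(lambda: ort_session.run(None, ort_inputs)) and compared against the PyTorch forward pass wrapped in torch.no_grad().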
Results Interpretation

Before: Inference time = 120 ms, Accuracy = 85%

After: Inference time = 35 ms, Accuracy match with PyTorch = 100%

In this experiment, ONNX Runtime cut inference time substantially while producing the same predictions as PyTorch, which makes it a practical option for real-time computer vision tasks. The actual speedup depends on the model and the hardware.
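The solution compares predictions on one dummy image, while the task asks for accuracy above 83% on a validation set. A minimal sketch of how both checks could be computed from logits is shown below; the function names and the tiny arrays are made up for illustration and are not real model outputs.

```python
import numpy as np


def top1_accuracy(logits, labels):
    """Fraction of rows whose arg-max class equals the true label."""
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == np.asarray(labels)))


def top1_agreement(logits_a, logits_b):
    """Fraction of rows where two models pick the same top class."""
    preds_a = np.argmax(logits_a, axis=1)
    preds_b = np.argmax(logits_b, axis=1)
    return float(np.mean(preds_a == preds_b))


# Made-up logits: 3 images, 3 classes (illustration only)
torch_logits = np.array([[2.0, 0.1, 0.3],
                         [0.2, 1.5, 0.1],
                         [0.3, 0.2, 0.9]])
onnx_logits = torch_logits + 1e-5  # small numeric drift after export
labels = [0, 1, 0]

print("agreement:", top1_agreement(torch_logits, onnx_logits))
print("accuracy:", top1_accuracy(onnx_logits, labels))
print("numerically close:", np.allclose(torch_logits, onnx_logits, atol=1e-4))
```

With real data, the logits would be collected by running both PyTorch and ONNX Runtime over the validation images.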
Bonus Experiment
Try optimizing the ONNX model further by enabling ONNX Runtime's graph optimizations and using a GPU execution provider.
💡 Hint
Use ort.SessionOptions to enable optimizations and set providers=['CUDAExecutionProvider'] if a GPU is available.
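The hint can be sketched roughly as follows. The helper names choose_providers and make_session are hypothetical; the ONNX Runtime calls used (SessionOptions, GraphOptimizationLevel.ORT_ENABLE_ALL, get_available_providers, InferenceSession) are existing API. The import sits inside make_session so that the provider-selection logic can be tried without ONNX Runtime installed.

```python
def choose_providers(available):
    """Prefer CUDA when present, and always keep CPU as a fallback."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [p for p in preferred if p in available]


def make_session(model_path):
    """Create a session with graph optimizations and the best provider."""
    import onnxruntime as ort  # imported lazily; requires onnxruntime

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, sess_options=opts,
                                providers=providers)


# Provider selection on a CPU-only machine (illustrative input list)
print(choose_providers(["CPUExecutionProvider"]))
```

Calling make_session('resnet18.onnx') would then build the session; the CUDA provider is only listed when a GPU build of ONNX Runtime (the onnxruntime-gpu package) is installed.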

Practice

1. What is the main purpose of ONNX Runtime in machine learning?
easy
A. To collect and label training data
B. To train new machine learning models from scratch
C. To visualize data and create charts
D. To run pre-trained machine learning models efficiently on different devices

Solution

  1. Step 1: Understand ONNX Runtime's role

    ONNX Runtime is designed to run models that are already trained, not to train new ones.
  2. Step 2: Identify the correct purpose

    It helps run these models efficiently on many devices, making deployment easier.
  3. Final Answer:

    To run pre-trained machine learning models efficiently on different devices -> Option D
  4. Quick Check:

    ONNX Runtime runs models = D [OK]
Hint: ONNX Runtime runs models, not trains them [OK]
Common Mistakes:
  • Confusing ONNX Runtime with training frameworks
  • Thinking it is for data visualization
  • Assuming it collects or labels data
2. Which Python code snippet correctly loads an ONNX model using ONNX Runtime?
easy
A. import onnxruntime as ort; session = ort.Model('model.onnx')
B. import onnxruntime as ort; session = ort.load_model('model.onnx')
C. import onnxruntime as ort; session = ort.InferenceSession('model.onnx')
D. import onnxruntime as ort; session = ort.run('model.onnx')

Solution

  1. Step 1: Recall ONNX Runtime loading method

    The correct method to load a model is using InferenceSession with the model file path.
  2. Step 2: Check each option

    Only option C, session = ort.InferenceSession('model.onnx'), uses a valid API; the other options call functions that do not exist in onnxruntime.
  3. Final Answer:

    import onnxruntime as ort; session = ort.InferenceSession('model.onnx') -> Option C
  4. Quick Check:

    Use InferenceSession to load model = C [OK]
Hint: Use ort.InferenceSession('model.onnx') to load model [OK]
Common Mistakes:
  • Using non-existent methods like load_model or run
  • Not importing onnxruntime correctly
  • Confusing model loading with running
3. Given the code below, what will be the output type of outputs?
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: input_data})
print(type(outputs))
medium
A. <class 'list'>
B.
C.
D.

Solution

  1. Step 1: Understand session.run output

    Calling session.run returns a list of outputs from the model.
  2. Step 2: Check the print statement

    Printing type(outputs) will show <class 'list'> because outputs is a list.
  3. Final Answer:

    <class 'list'> -> Option A
  4. Quick Check:

    session.run returns list = A [OK]
Hint: session.run returns a list of outputs [OK]
Common Mistakes:
  • Assuming outputs is a numpy array directly
  • Thinking outputs is a dictionary
  • Confusing tuple with list
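Because session.run returns a plain list, one entry per model output, it can be convenient to pair each entry with its output name. The sketch below shows one way to do that; run_named is a hypothetical helper, and the SimpleNamespace object is only a stand-in so the example runs without a model file. A real ort.InferenceSession exposes the same get_outputs and run methods.

```python
from types import SimpleNamespace


def run_named(session, feeds):
    """Run the model and return outputs as a {name: value} dict."""
    names = [out.name for out in session.get_outputs()]
    values = session.run(None, feeds)  # list, in model output order
    return dict(zip(names, values))


# Stand-in for ort.InferenceSession('model.onnx'); illustration only
fake_session = SimpleNamespace(
    get_outputs=lambda: [SimpleNamespace(name="output")],
    run=lambda output_names, feeds: [[0.1, 0.9]],
)

result = run_named(fake_session, {"input": [1.0]})
print(type(result), list(result))
```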
4. Identify the error in the following ONNX Runtime code snippet:
import onnxruntime as ort
session = ort.InferenceSession('model.onnx')
input_name = session.get_inputs()[0]
input_data = [1.0, 2.0, 3.0]
outputs = session.run(None, {input_name: input_data})
medium
A. input_name should be the name string, not the input object
B. input_data must be a dictionary, not a list
C. session.run requires the model path as first argument
D. onnxruntime does not support list inputs

Solution

  1. Step 1: Check input_name assignment

    session.get_inputs()[0] returns an input object, but session.run expects the input name string as key.
  2. Step 2: Correct usage

    Use session.get_inputs()[0].name to get the input name string for the dictionary key.
  3. Final Answer:

    input_name should be the name string, not the input object -> Option A
  4. Quick Check:

    Use input_name = session.get_inputs()[0].name [OK]
Hint: Use input_name = session.get_inputs()[0].name [OK]
Common Mistakes:
  • Using input object instead of input name string
  • Passing wrong input data types
  • Misunderstanding session.run arguments
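A corrected version of the snippet could look like the sketch below, assuming a single-input model that expects float32 data. build_feed is a hypothetical helper, and the SimpleNamespace object is a stand-in for a real session so the example runs without a model file.

```python
from types import SimpleNamespace

import numpy as np


def build_feed(session, data):
    """Build the input dict for session.run from raw data."""
    input_name = session.get_inputs()[0].name  # the name string, not the object
    array = np.asarray(data, dtype=np.float32)  # NumPy array, not a Python list
    return {input_name: array}


# Stand-in for ort.InferenceSession('model.onnx'); illustration only
fake_session = SimpleNamespace(
    get_inputs=lambda: [SimpleNamespace(name="input")],
)

feed = build_feed(fake_session, [1.0, 2.0, 3.0])
print(list(feed), feed["input"].dtype)
```

With a real session, the result would be passed as session.run(None, feed).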
5. You want to run an ONNX model on a GPU using ONNX Runtime. Which code snippet correctly enables GPU execution?
hard
A. import onnxruntime as ort; session = ort.InferenceSession('model.onnx', execution_mode='GPU')
B. import onnxruntime as ort; session = ort.InferenceSession('model.onnx', providers=['CUDAExecutionProvider'])
C. import onnxruntime as ort; session = ort.InferenceSession('model.onnx', use_gpu=True)
D. import onnxruntime as ort; session = ort.InferenceSession('model.onnx', device='GPU')

Solution

  1. Step 1: Recall how to enable GPU in ONNX Runtime

    ONNX Runtime uses the 'providers' argument with 'CUDAExecutionProvider' to run on GPU.
  2. Step 2: Check each option

    Only option B passes providers=['CUDAExecutionProvider'] to ort.InferenceSession; the other options use parameters that do not select a GPU.
  3. Final Answer:

    import onnxruntime as ort; session = ort.InferenceSession('model.onnx', providers=['CUDAExecutionProvider']) -> Option B
  4. Quick Check:

    Use providers=['CUDAExecutionProvider'] for GPU [OK]
Hint: Set providers=['CUDAExecutionProvider'] to use GPU [OK]
Common Mistakes:
  • Using non-existent parameters like device or use_gpu
  • Confusing execution_mode with providers
  • Not specifying providers disables GPU
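Requesting the CUDA provider does not guarantee it is used: to my understanding, ONNX Runtime can fall back to the CPU when the provider is unavailable. One way to check is to compare the requested list against session.get_providers(). The helper missing_providers below is hypothetical, and the active list is a made-up stand-in for the value a real session would return.

```python
def missing_providers(requested, active):
    """Return requested providers that the session is not actually using."""
    return [p for p in requested if p not in active]


requested = ["CUDAExecutionProvider", "CPUExecutionProvider"]

# With a real session this would be: active = session.get_providers()
active = ["CPUExecutionProvider"]  # stand-in: machine without a usable GPU

print(missing_providers(requested, active))
```

An empty result means every requested provider is active; anything else is worth logging before trusting a latency measurement.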