
ONNX Runtime inference in PyTorch

Introduction

ONNX Runtime is a high-performance engine for running machine learning models on CPUs, GPUs, and other devices. Because it works from the framework-neutral ONNX format, a model trained in one tool can be run in many others. Typical reasons to use it:

You want to run a PyTorch model faster on a CPU or GPU.
You need to deploy a model to a device that does not support PyTorch directly.
You want to share a model with others who use different frameworks.
You want to run the same model on different platforms like Windows, Linux, or mobile.
You want to compare performance between PyTorch and ONNX Runtime.
Syntax
Python
import onnxruntime

# Load the ONNX model
session = onnxruntime.InferenceSession('model.onnx')

# Prepare input as a dictionary
inputs = {session.get_inputs()[0].name: input_array}

# Run inference
outputs = session.run(None, inputs)

You must export your PyTorch model to ONNX format first.

Input data must be a numpy array matching the model input shape.
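Shape and dtype mismatches are the most common cause of inference errors, so it is worth checking the array before calling session.run. A numpy-only sketch (no model required); note that numpy creates float64 arrays by default, while most ONNX models expect float32:

```python
import numpy as np

# NumPy creates float64 arrays by default, but most ONNX models expect float32.
x = np.random.rand(1, 3, 224, 224)
print(x.dtype)  # float64

# Convert explicitly and verify shape and dtype before inference.
x = x.astype(np.float32)
assert x.shape == (1, 3, 224, 224)
assert x.dtype == np.float32
```

You can read the model's expected shape and type from session.get_inputs()[0].shape and .type to compare against.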

Examples
Run inference on a random image-like input for a model expecting 1x3x224x224 input.
Python
import onnxruntime
import numpy as np

session = onnxruntime.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: input_data})
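session.run returns a list of numpy arrays, one per model output. For a classifier, a typical next step is to turn outputs[0] into probabilities and a predicted class. A sketch using a hypothetical stand-in array, since no real model is loaded here:

```python
import numpy as np

# Stand-in for outputs[0] from a 3-class classifier (hypothetical values).
logits = np.array([[0.1, 2.5, -0.3]], dtype=np.float32)

# Softmax with the usual max-subtraction for numerical stability.
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)

# Index of the most likely class.
pred = int(np.argmax(probs, axis=1)[0])
```

With the values above, pred is 1, the position of the largest logit.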
Run inference with your own prepared input data.
Python
import onnxruntime

session = onnxruntime.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name
input_data = your_numpy_array
outputs = session.run(None, {input_name: input_data})
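A common way to produce input_data in the first place is to preprocess an image into the NCHW float32 layout most vision models expect. A minimal sketch, using a random array in place of a real decoded image:

```python
import numpy as np

# Pretend this is a decoded 224x224 RGB image: (height, width, channels), uint8.
img = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)

x = img.astype(np.float32) / 255.0   # scale pixel values to [0, 1]
x = np.transpose(x, (2, 0, 1))       # HWC -> CHW
x = x[np.newaxis, ...]               # add batch dimension -> NCHW

assert x.shape == (1, 3, 224, 224)
```

Real pipelines often also subtract a per-channel mean and divide by a standard deviation; the exact values depend on how the model was trained.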
Sample Model

This code creates a simple linear model in PyTorch, exports it to ONNX format, then loads and runs it using ONNX Runtime. It prints the input and the model's output.

Python
import torch
import torch.nn as nn
import numpy as np
import onnxruntime

# Define a simple PyTorch model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)
    def forward(self, x):
        return self.linear(x)

# Create model and dummy input
model = SimpleModel()
model.eval()
dummy_input = torch.randn(1, 4)

# Export to ONNX
onnx_path = 'simple_model.onnx'
torch.onnx.export(model, dummy_input, onnx_path, input_names=['input'], output_names=['output'], opset_version=11)

# Prepare input for ONNX Runtime
input_data = dummy_input.numpy()

# Load ONNX model with ONNX Runtime
session = onnxruntime.InferenceSession(onnx_path)
input_name = session.get_inputs()[0].name

# Run inference
outputs = session.run(None, {input_name: input_data})

print('Input:', input_data)
print('Output:', outputs[0])
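To see what the exported graph actually computes, note that nn.Linear(4, 2) is just a matrix multiply plus a bias. A numpy-only sketch with hypothetical weights (the real values live inside the exported model):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4)).astype(np.float32)  # hypothetical weight matrix
b = rng.standard_normal(2).astype(np.float32)       # hypothetical bias
x = rng.standard_normal((1, 4)).astype(np.float32)

# The same computation nn.Linear(4, 2) performs: y = x @ W^T + b.
y = x @ W.T + b
assert y.shape == (1, 2)
```

A useful sanity check after exporting is to compare the PyTorch output against the ONNX Runtime output with np.allclose; they should agree up to small floating-point differences.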
Important Notes

ONNX Runtime supports many hardware accelerators, exposed as execution providers such as CUDA, TensorRT, DirectML, and CoreML, for faster inference.
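You select accelerators through the providers argument of InferenceSession, listed in priority order with a CPU fallback last. The session line is commented out here because it needs onnxruntime and a model file on disk:

```python
# Preferred execution providers, in priority order; ONNX Runtime falls back
# to the next entry if one is unavailable on the current machine.
providers = ["CUDAExecutionProvider", "CPUExecutionProvider"]

# session = onnxruntime.InferenceSession("model.onnx", providers=providers)
```

Keeping CPUExecutionProvider last means the same code runs on machines without a GPU.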

Make sure the input data type and shape match the model's expected input.

ONNX Runtime can run models exported from many frameworks, not just PyTorch.

Summary

ONNX Runtime helps run machine learning models fast and on many devices.

You export your PyTorch model to ONNX format, then load it with ONNX Runtime.

Prepare input as a numpy array and run inference with session.run().