ONNX Runtime is a high-performance engine for running machine learning models across many devices. It lets you take a model built in one framework and run it in other environments with minimal effort.
ONNX Runtime inference in PyTorch
Introduction
You want to run a PyTorch model faster on a CPU or GPU.
You need to deploy a model to a device that does not support PyTorch directly.
You want to share a model with others who use different frameworks.
You want to run the same model on different platforms like Windows, Linux, or mobile.
You want to compare performance between PyTorch and ONNX Runtime.
Syntax
Python
```python
import onnxruntime

# Load the ONNX model
session = onnxruntime.InferenceSession('model.onnx')

# Prepare input as a dictionary
inputs = {session.get_inputs()[0].name: input_array}

# Run inference
outputs = session.run(None, inputs)
```
You must export your PyTorch model to ONNX format first.
Input data must be a numpy array matching the model input shape.
Examples
Run inference on a random image-like input for a model expecting 1x3x224x224 input.
Python
```python
import onnxruntime
import numpy as np

session = onnxruntime.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name

# Generate random input shaped like a 1x3x224x224 image batch
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)

outputs = session.run(None, {input_name: input_data})
```
Run inference with your own prepared input data.
Python
```python
import onnxruntime

session = onnxruntime.InferenceSession('model.onnx')
input_name = session.get_inputs()[0].name

# Replace with your own numpy array matching the model's input shape and dtype
input_data = your_numpy_array

outputs = session.run(None, {input_name: input_data})
```
Sample Model
This code creates a simple linear model in PyTorch, exports it to ONNX format, then loads and runs it using ONNX Runtime. It prints the input and the model's output.
Python
```python
import torch
import torch.nn as nn
import numpy as np
import onnxruntime

# Define a simple PyTorch model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

# Create model and dummy input
model = SimpleModel()
model.eval()
dummy_input = torch.randn(1, 4)

# Export to ONNX
onnx_path = 'simple_model.onnx'
torch.onnx.export(model, dummy_input, onnx_path,
                  input_names=['input'], output_names=['output'],
                  opset_version=11)

# Prepare input for ONNX Runtime
input_data = dummy_input.numpy()

# Load ONNX model with ONNX Runtime
session = onnxruntime.InferenceSession(onnx_path)
input_name = session.get_inputs()[0].name

# Run inference
outputs = session.run(None, {input_name: input_data})
print('Input:', input_data)
print('Output:', outputs[0])
```
Important Notes
ONNX Runtime supports many hardware accelerators for faster inference.
Make sure the input data type and shape match the model's expected input.
ONNX Runtime can run models exported from many frameworks, not just PyTorch.
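The shape and dtype note above can be turned into a small pre-flight check. `validate_input` below is a hypothetical helper, not part of the ONNX Runtime API; it treats non-integer dimensions (such as `None` or a name like `'batch'`, which ONNX Runtime reports for dynamic axes) as matching any size.

```python
import numpy as np

def validate_input(array, expected_shape, expected_dtype=np.float32):
    """Check a numpy array against a model's declared input.

    expected_shape may contain None or a string for dynamic dimensions.
    """
    if array.dtype != expected_dtype:
        raise TypeError(f'dtype {array.dtype} != expected {expected_dtype}')
    if len(array.shape) != len(expected_shape):
        raise ValueError(f'rank {len(array.shape)} != expected {len(expected_shape)}')
    for actual, expected in zip(array.shape, expected_shape):
        # Non-integer entries are dynamic dimensions: any size is accepted
        if isinstance(expected, int) and actual != expected:
            raise ValueError(f'shape {array.shape} != expected {expected_shape}')
    return True

# Example: an image input with a dynamic batch dimension
x = np.random.randn(2, 3, 224, 224).astype(np.float32)
validate_input(x, ['batch', 3, 224, 224])  # passes
```

Running a check like this before `session.run()` turns a cryptic runtime error into a clear message about which dimension or dtype is wrong.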
Summary
ONNX Runtime runs machine learning models quickly and on many devices.
You export your PyTorch model to ONNX format, then load it with ONNX Runtime.
Prepare input as a numpy array and run inference with session.run().