Execution providers
Execution providers (EPs) enable ONNX Runtime to execute ONNX graphs with hardware acceleration. If you have specialized hardware like a GPU or NPU, execution providers can provide a massive performance boost to your ort applications. For more information on the intricacies of execution providers, see the ONNX Runtime docs.
Not all platforms support all execution providers, of course. This handy widget will show you which EPs are supported on your platform, and which ones have binaries available (and are thus ready to use in ort!).
Registering execution providers
To enable the use of an execution provider inside ort, you’ll need to enable its respective Cargo feature, e.g. the cuda feature to use CUDA, or the coreml feature to use CoreML.
```toml
[dependencies]
ort = { version = "2.0", features = [ "cuda" ] }
```

See the widget above for the full list of EPs and their corresponding Cargo features.
In order to configure sessions to use certain execution providers, you must register them when creating an environment or session. You can do this via the SessionBuilder::with_execution_providers method. For example, to register the CUDA execution provider for a session:
```rust
use ort::{ep::CUDA, session::Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([CUDA::default().build()])?
        .commit_from_file("model.onnx")?;

    Ok(())
}
```

You can, of course, specify multiple execution providers. ort will register all EPs specified, in order. If an EP does not support a certain operator in a graph, it will fall back to the next successfully registered EP, or to the CPU if all else fails.
```rust
use ort::{ep, session::Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([
            // Prefer TensorRT over CUDA.
            ep::TensorRT::default().build(),
            ep::CUDA::default().build(),
            // Use DirectML on Windows if NVIDIA EPs are not available.
            ep::DirectML::default().build(),
            // Or use ANE on Apple platforms.
            ep::CoreML::default().build()
        ])?
        .commit_from_file("model.onnx")?;

    Ok(())
}
```

Configuring EPs
EPs have configuration options to control behavior or increase performance. Each execution provider struct returns a builder with configuration methods. See the API reference for the EP structs for more information on which options are supported and what they do.
```rust
use ort::{ep, session::Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([
            ep::CoreML::default()
                // This model uses control flow operators, so enable CoreML on subgraphs too.
                .with_subgraphs()
                .with_compute_units(ep::coreml::ComputeUnits::CPUAndNeuralEngine)
                .build()
        ])?
        .commit_from_file("model.onnx")?;

    Ok(())
}
```

Fallback behavior
ort will silently fall back to executing on the CPU if all execution providers fail to register. In many cases, though, you’ll want to show the user an error message when an EP fails to register, or outright abort the process.
You can configure an EP to return an error on failure by adding .error_on_failure() after you .build() it. In this example, if CUDA doesn’t register successfully, the program will exit with an error at with_execution_providers:
```rust
use ort::{ep, session::Session};

fn main() -> anyhow::Result<()> {
    let session = Session::builder()?
        .with_execution_providers([
            ep::CUDA::default().build().error_on_failure()
        ])?
        .commit_from_file("model.onnx")?;

    Ok(())
}
```

If you require more complex error handling, you can also manually register execution providers via the ExecutionProvider::register method:
```rust
use ort::{
    ep::{self, ExecutionProvider},
    session::Session
};

fn main() -> anyhow::Result<()> {
    let builder = Session::builder()?;

    let cuda = ep::CUDA::default();
    if cuda.register(&builder).is_err() {
        eprintln!("Failed to register CUDA!");
        std::process::exit(1);
    }

    let session = builder.commit_from_file("model.onnx")?;

    Ok(())
}
```

You can also check whether ONNX Runtime is even compiled with support for the execution provider with the is_available method.
```rust
use ort::{
    ep::{self, ExecutionProvider},
    session::Session
};

fn main() -> anyhow::Result<()> {
    let builder = Session::builder()?;

    let coreml = ep::CoreML::default();
    if !coreml.is_available() {
        eprintln!("Please compile ONNX Runtime with CoreML!");
        std::process::exit(1);
    }

    // Note that even though ONNX Runtime was compiled with CoreML, registration could still fail!
    coreml.register(&builder)?;

    let session = builder.commit_from_file("model.onnx")?;

    Ok(())
}
```

Global EPs
You can configure EPs to be registered for all sessions created throughout the program by configuring the environment:
```rust
use ort::{ep, session::Session};

fn main() -> anyhow::Result<()> {
    ort::init()
        .with_execution_providers([ep::CUDA::default().build()])
        .commit();

    let session = Session::builder()?.commit_from_file("model.onnx")?;
    // The session will attempt to register the CUDA EP
    // since we configured environment EPs.

    Ok(())
}
```

Environment configs must be committed before you create any sessions, otherwise the configuration will not take effect!
EPs configured on a per-session basis (with SessionBuilder::with_execution_providers) take precedence over environment EPs, but they won’t replace them.
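To sketch what that precedence means in practice (using the same ep API as the examples above; the EP choices here are illustrative), a session that registers TensorRT on top of an environment-level CUDA EP will try TensorRT first, then CUDA, then the CPU:

```rust
use ort::{ep, session::Session};

fn main() -> anyhow::Result<()> {
    // Environment EPs apply to every session created afterwards...
    ort::init()
        .with_execution_providers([ep::CUDA::default().build()])
        .commit();

    // ...but session-level EPs are tried first.
    // Effective order: TensorRT, then CUDA, then CPU.
    let session = Session::builder()?
        .with_execution_providers([ep::TensorRT::default().build()])?
        .commit_from_file("model.onnx")?;

    Ok(())
}
```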
Troubleshooting
If it seems like the execution provider is not registering properly, or you are not getting acceptable performance, see the Troubleshooting: Performance page for more information on how to debug any EP issues.
Notes
Dynamically-linked EP requirements
Certain EPs, like CUDA and TensorRT, use a separate interface that requires them to be compiled as dynamic libraries, which are loaded at runtime when the EP is registered. The DirectML and WebGPU EPs do not use this interface, but do require helper dylibs.
Due to the quirks of dynamic library loading, you may encounter issues with builds including these EPs due to ONNX Runtime failing to find the dylibs at runtime. ort’s copy-dylibs Cargo feature (which is enabled by default) tries to alleviate this issue by symlinking these dylibs into your target folder so they can be found by your application when in development. On Windows platforms that don’t have Developer Mode enabled, a copy is instead performed. On other platforms, additional setup is required to get the application to load dylibs from its parent folder.
See Runtime dylib loading for more information.
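As one example of the "additional setup" on Linux, you can embed an rpath so the binary searches its own directory for dylibs. This is a common pattern, not something ort specifically requires; the target triple is just an example:

```toml
# .cargo/config.toml
[target.x86_64-unknown-linux-gnu]
rustflags = ["-C", "link-args=-Wl,-rpath,$ORIGIN"]
```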
Prebuilt binary combos
ort provides prebuilt binaries for the following combinations of EP features:
- directml/xnnpack/coreml are available in any build if the platform supports it.
- cuda/tensorrt
- webgpu
- nvrtx
This means that we have builds for features = ["cuda", "tensorrt", "directml"], but not features = ["cuda", "webgpu"]. Specifying both cuda and webgpu, for example, will fall back to downloading a CPU-only build.
If you want a single build of ONNX Runtime that has both cuda and webgpu, you’ll have to compile it from source.
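For instance, the cuda/tensorrt combo from the list above can be requested directly, with directml layered on top since it is available in any build (this is one supported combination, not the only one):

```toml
[dependencies]
ort = { version = "2.0", features = [ "cuda", "tensorrt", "directml" ] }
```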
CUDA
ort provides binaries for CUDA ≥ 12.8 or ≥ 13.2, and targets cuDNN ≥ 9.19. Make sure CUDA and cuDNN are installed and available on the PATH.
ort will try to automatically detect which CUDA version you’re using, but sometimes it gets it wrong and reverts to CUDA 12 (especially if you have both 12 and 13 installed). You can override the CUDA version by setting the ORT_CUDA_VERSION environment variable to 12 or 13.
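The override is just an environment variable set at build time; assuming a typical cargo invocation, it would look like:

```shell
# Force ort to link against CUDA 13, even if CUDA 12 is also installed.
ORT_CUDA_VERSION=13 cargo build --features cuda
```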
WebGPU
The WebGPU EP is experimental and may produce incorrect results/crashes; these issues should be reported upstream as there’s unfortunately nothing we can do about them.
WebGPU binaries are provided for Windows, macOS, and Linux. On Windows, DirectX 12 and 11 are supported. On Linux, Vulkan and OpenGL/GLES are supported.