Labels: ep:CUDA (issues related to the CUDA execution provider)
Describe the issue
It worked before, but after updating to the recent version it stopped working.
[E:onnxruntime:Default, provider_bridge_ort.cc:1862 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1539 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn_adv.so.9: cannot open shared object file: No such file or directory
find ./venv -type f -name 'libcudnn_adv.so.9'
./venv/lib/python3.11/site-packages/nvidia/cudnn/lib/libcudnn_adv.so.9
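As a possible workaround (a minimal sketch, not an official fix): since importing torch no longer makes the bundled cuDNN libraries visible to the loader, one can preload them manually from the nvidia-cudnn wheel before importing onnxruntime. The directory below is an assumption based on the find output above.
import ctypes
import glob
import os
import sysconfig

# Hedged workaround sketch: preload the cuDNN 9 shared libraries shipped with
# the nvidia-cudnn wheel so the CUDA execution provider can resolve
# libcudnn_adv.so.9. The lib directory is assumed from the find output above.
site_packages = sysconfig.get_paths()["purelib"]
cudnn_lib_dir = os.path.join(site_packages, "nvidia", "cudnn", "lib")
for lib in sorted(glob.glob(os.path.join(cudnn_lib_dir, "libcudnn*.so.9"))):
    ctypes.CDLL(lib, mode=ctypes.RTLD_GLOBAL)

import onnxruntime as ort  # the CUDA provider should now find libcudnn_adv.so.9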
To reproduce
pip install torch==2.6.0+cu124
pip install onnxruntime-gpu==1.20.1
then run the following script:
import onnx
import onnx.helper as helper
import torch  # importing torch used to help ORT find the CUDA libs, but this trick has stopped working
import onnxruntime as ort
import numpy as np


def test_onnxruntime_cuda_provider():
    # 1. Create a simple ONNX model (addition of two inputs)
    # Define input shapes and types
    input_shape = [2, 2]
    input_type = np.float32

    # Create ONNX model
    input1 = helper.make_tensor_value_info('input1', onnx.TensorProto.FLOAT, input_shape)
    input2 = helper.make_tensor_value_info('input2', onnx.TensorProto.FLOAT, input_shape)
    output = helper.make_tensor_value_info('output', onnx.TensorProto.FLOAT, input_shape)

    # Define a simple addition node
    add_node = helper.make_node('Add', inputs=['input1', 'input2'], outputs=['output'])

    # Create the graph
    graph = helper.make_graph([add_node], 'add_graph', [input1, input2], [output])

    # Create the model
    model = helper.make_model(
        graph,
        opset_imports=[helper.make_opsetid("", 21)]
    )

    # 2. Run inference using ONNX Runtime with the CUDA provider
    # Check available providers
    available_providers = ort.get_available_providers()
    assert 'CUDAExecutionProvider' in available_providers, "CUDAExecutionProvider is not available"

    # Create an inference session with the CUDA provider
    session = ort.InferenceSession(model.SerializeToString(), providers=['CUDAExecutionProvider'])

    # Prepare input data
    input_data1 = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=input_type)
    input_data2 = np.array([[5.0, 6.0], [7.0, 8.0]], dtype=input_type)

    # Run inference
    outputs = session.run(None, {'input1': input_data1, 'input2': input_data2})

    # Verify the output
    expected_output = np.array([[6.0, 8.0], [10.0, 12.0]], dtype=input_type)
    np.testing.assert_allclose(outputs[0], expected_output, rtol=1e-5, atol=1e-5)
    print("Test passed! Output matches expected result.")


if __name__ == "__main__":
    test_onnxruntime_cuda_provider()
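For completeness, after the session is created in the script above one can also confirm which execution providers actually attached; when the CUDA provider fails to load, onnxruntime falls back to CPU with only a warning. This small check reuses the session object from the script:
# Confirm the CUDA provider actually attached (otherwise ORT silently runs on CPU)
assert 'CUDAExecutionProvider' in session.get_providers(), \
    "Session fell back to CPUExecutionProvider"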
Urgency
Quite high. We currently cannot run ONNX Runtime with the CUDA provider at all.
Platform
Linux
OS Version
AlmaLinux 9.5
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.20.1
ONNX Runtime API
Python
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.4.127