Closed
Description
Describe the bug
I recently ran into an issue after updating ONNX Runtime from v1.7 (CUDA 11.0) to v1.8.2 (CUDA 11.0) or v1.10 (CUDA 11.4).
The program crashes at random when several ONNX models are loaded at the same time, each on its own thread. This did not happen with v1.7.
The error message is: Failed to add kernel for MemcpyFromHost: Conflict with existing kernel def hash (kernel_registry.cc line 305)
Urgency
Not urgent, since v1.7 is still running fine.
System information
- OS Platform and Distribution: Win 10
- ONNX Runtime installed from (source or binary): Source compiled - I had to change the target name to onnxruntime_CC, since Windows sometimes brings their own onnxruntime lib without cuda.
- ONNX Runtime version: v1.8.2 with cuda 11.0 and v1.10 with cuda 11.4
- Python version: -
- Visual Studio version (if applicable): Visual Studio 2017 (v141)
- GCC/Compiler version (if compiling from source): Visual Studio 2019 (v142)
- CUDA/cuDNN version: 11.0 and 11.4
- GPU model and memory: 2080Ti and 3070Ti
To Reproduce
- See below for example program and onnx model attached.
- ExampleModel
#include <iostream>
#include <string>
#include <thread>
#include <onnxruntime_cxx_api.h>

void load_graph(std::wstring file)
{
    Ort::Session *m_session;
    Ort::Env m_env;
    Ort::SessionOptions sessionOptions;
    Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CUDA(sessionOptions, 0));
    m_env = Ort::Env(ORT_LOGGING_LEVEL_ERROR, "Inference");
    m_session = new Ort::Session(m_env, file.c_str(), sessionOptions);
    std::cout << "Load graph." << std::endl;
}

int main(int argc, char* argv[])
{
    try
    {
        std::wstring new_graph_file_name = L"D:\\example.onnx";
        std::thread first (load_graph, new_graph_file_name);
        std::thread second (load_graph, new_graph_file_name);
        std::thread third (load_graph, new_graph_file_name);
        load_graph(new_graph_file_name);
        first.join();
        second.join();
        third.join();
    }
    catch (const std::exception& e)
    {
        std::cout << "Caught exception: " << e.what() << std::endl;
        return 1;
    }
    catch (...)
    {
        std::cout << "Fatal error: Caught unknown exception!" << std::endl;
        return 1;
    }
    std::cout << "Everything fine!" << std::endl;
    return 0;
}
Expected behavior
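All four loads (three threads plus the main thread) should complete and the program should print "Everything fine!". The crash is intermittent and difficult to reproduce: if you start the program repeatedly, roughly 1 run in 10 crashes.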
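Because the failure only shows up in about one run out of ten, it can help to automate the repeated launches. The following is a minimal sketch, not part of the original report: `count_failures` is a hypothetical helper that runs a command through the system shell a given number of times and counts the runs that end with a non-zero status. It assumes the example program above has been built into an executable of your choosing.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of ONNX Runtime): runs `command` `runs` times
// through the system shell and returns how many runs ended with a non-zero
// status, for example a crash or a `return 1` from main.
int count_failures(const std::string& command, int runs)
{
    assert(runs >= 0);
    int failures = 0;
    for (int i = 0; i < runs; ++i)
    {
        // The value returned by std::system is implementation-defined, but it
        // is 0 when the command exits successfully under both cmd.exe and
        // POSIX shells, so any non-zero value is counted as a failure.
        if (std::system(command.c_str()) != 0)
        {
            ++failures;
        }
    }
    return failures;
}
```

For instance, calling `count_failures("repro.exe", 50)` (where `repro.exe` is a placeholder name for the compiled example) and printing the result gives a rough failure rate that can be compared between v1.7 and the newer versions.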
