TRT EP failed to create model session with CUDA custom op #12282

@toothache

Description

Describe the bug
The TensorRT EP fails to create an inference session for a model that contains a CUDA custom op.

Urgency
none.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 18.04, Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: 1.10.0
  • Python version: 3.8
  • Visual Studio version (if applicable): VS2019
  • GCC/Compiler version (if compiling from source): 8.4
  • CUDA/cuDNN version: 11.4 / 8.2
  • GPU model and memory: V100

To Reproduce

  • Steps/code to reproduce the behavior: the following code creates an ORT session for the model with the custom op.
#include <onnxruntime/core/session/onnxruntime_cxx_api.h>

#include <array>
#include <iostream>
#include <memory>
#include <string>

using namespace std;

namespace Ort
{
    // Implements the PackImageToSeqOp operator kernel
    struct PackImageToSeqKernel
    {
        inline PackImageToSeqKernel(CustomOpApi ort, const OrtKernelInfo* info) : ort_(ort)
        {
            margin_width_ = ort_.KernelInfoGetAttribute<int64_t>(info, "margin_width");
        }

        inline void GetOutputShape(OrtKernelContext* /*context*/, size_t /*output_index*/, OrtTensorTypeAndShapeInfo* /*info*/) {}

        void Compute(OrtKernelContext* /*context*/) { }

    private:
        CustomOpApi ort_;
        int64_t margin_width_;
    };

    // Implements the PackImageToSeqOp operator
    struct PackImageToSeqOp : CustomOpBase<PackImageToSeqOp, PackImageToSeqKernel>
    {
        PackImageToSeqOp(const char* provider) : provider_(provider) {}

        inline void* CreateKernel(CustomOpApi api, const OrtKernelInfo* info) const { return new PackImageToSeqKernel(api, info); }
        inline const char* GetName() const { return "PackImageToSeq"; }
        const char* GetExecutionProviderType() const { return provider_; }

        static constexpr std::array<ONNXTensorElementDataType, 2> c_inputTypes =
        {
            // input type might be float/double/half precision values
            ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED, // input[0]: input feature, 1*C*H*W
            ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,     // input[1]: sequence lengths for each image
        };

        inline size_t GetInputTypeCount() const { return c_inputTypes.size(); }
        inline ONNXTensorElementDataType GetInputType(size_t index) const { return c_inputTypes[index]; }

        static constexpr std::array<ONNXTensorElementDataType, 1> c_outputTypes =
        {
            // same with input type
            ONNX_TENSOR_ELEMENT_DATA_TYPE_UNDEFINED, // output[0]: output feature, N*C*H*MaxSeqW
        };

        inline size_t GetOutputTypeCount() const { return c_outputTypes.size(); }
        inline ONNXTensorElementDataType GetOutputType(size_t index) const { return c_outputTypes[index]; }

    private:
        const char* provider_;
    };
}  // namespace Ort

int main()
{
    Ort::Env env = Ort::Env{ORT_LOGGING_LEVEL_ERROR, "Default"};
    Ort::SessionOptions sessionOptions;

    sessionOptions.AppendExecutionProvider_TensorRT({});
    sessionOptions.AppendExecutionProvider_CUDA({});

    auto packImageToSeqOp = std::make_unique<Ort::PackImageToSeqOp>("CUDAExecutionProvider");

    Ort::CustomOpDomain customOpDomain("com.microsoft.oneocr");
    customOpDomain.Add(packImageToSeqOp.get());
    sessionOptions.Add(customOpDomain);

#ifdef _MSC_VER
    Ort::Session session(env, L"model.onnx", sessionOptions);
#else
    Ort::Session session(env, "model.onnx", sessionOptions);
#endif

    cout << "Create session successfully" << endl;

    return 0;
}
  • The ONNX model is attached to expedite investigation: model.zip

Expected behavior
We expect session creation to succeed, with the custom op falling back to the CUDA EP and all other operators running on the TensorRT EP.
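
As a sanity check (not part of the original report), it may help to verify that the custom op registers and the session builds when only the CUDA EP is appended; if this succeeds, the failure is isolated to the TensorRT EP's handling of the custom op during sub-graph Resolve(). A minimal sketch, assuming the same model.onnx and the PackImageToSeqOp definition from the repro above:

```cpp
// Diagnostic sketch: build the session with only the CUDA EP appended,
// so no node is offered to the TensorRT EP. Assumes the PackImageToSeqOp
// struct and model.onnx from the repro above are available.
#include <onnxruntime/core/session/onnxruntime_cxx_api.h>
#include <memory>

int main()
{
    Ort::Env env{ORT_LOGGING_LEVEL_ERROR, "Default"};
    Ort::SessionOptions sessionOptions;

    // No TensorRT EP here: every node, including the custom op,
    // is assigned to the CUDA EP (or the CPU fallback).
    sessionOptions.AppendExecutionProvider_CUDA({});

    auto op = std::make_unique<Ort::PackImageToSeqOp>("CUDAExecutionProvider");
    Ort::CustomOpDomain domain("com.microsoft.oneocr");
    domain.Add(op.get());
    sessionOptions.Add(domain);

#ifdef _MSC_VER
    Ort::Session session(env, L"model.onnx", sessionOptions);
#else
    Ort::Session session(env, "model.onnx", sessionOptions);
#endif
    return 0;
}
```

If this CUDA-only session initializes cleanly, the crash is specific to the TensorRT EP's GetSupportedList path shown in the stack trace below.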

Screenshots
The application crashes with the following stack trace.

2022-07-14 00:03:57.6057213 [E:onnxruntime:, inference_session.cc:1449 onnxruntime::InferenceSession::Initialize::<lambda_7d2791fa692be73b9f61f97a634cdaa3>::operator ()] Exception during initialization: D:\Source\cognition\onnxruntime\onnxruntime\core\providers\tensorrt\tensorrt_execution_provider.cc:890 onnxruntime::TensorrtExecutionProvider::GetSupportedList graph_build.Resolve().IsOK() was false.
Stacktrace:
D:\Source\cognition\onnxruntime\onnxruntime\core\providers\shared_library\provider_bridge_provider.cc(414): onnxruntime::GetStackTrace
D:\Source\cognition\onnxruntime\onnxruntime\core\providers\tensorrt\tensorrt_execution_provider.cc(890): onnxruntime::TensorrtExecutionProvider::GetSupportedList
D:\Source\cognition\onnxruntime\onnxruntime\core\providers\tensorrt\tensorrt_execution_provider.cc(1100): onnxruntime::TensorrtExecutionProvider::GetCapability
D:\Source\cognition\onnxruntime\onnxruntime\core\framework\graph_partitioner.cc(200): onnxruntime::PartitionOnnxFormatModelImpl
D:\Source\cognition\onnxruntime\onnxruntime\core\framework\graph_partitioner.cc(373): onnxruntime::GraphPartitioner::PartitionOnnxFormatModel
D:\Source\cognition\onnxruntime\onnxruntime\core\framework\graph_partitioner.cc(539): onnxruntime::GraphPartitioner::Partition
D:\Source\cognition\onnxruntime\onnxruntime\core\session\inference_session.cc(918): onnxruntime::InferenceSession::TransformGraph
D:\Source\cognition\onnxruntime\onnxruntime\core\session\inference_session.cc(1352): onnxruntime::InferenceSession::Initialize
D:\Source\cognition\onnxruntime\onnxruntime\core\session\onnxruntime_c_api.cc(688): `anonymous namespace'::InitializeSession
D:\Source\cognition\onnxruntime\onnxruntime\core\session\onnxruntime_c_api.cc(704): OrtApis::CreateSession
D:\Source\cognition\onnxruntime\build\install\include\onnxruntime\core\session\onnxruntime_cxx_inline.h(542): Ort::Session::Session
C:\Users\tooth\Desktop\ort_trt\main.cpp(52): main
D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl(79): invoke_main
D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl(288): __scrt_common_main_seh
D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl(331): __scrt_common_main
D:\agent\_work\10\s\src\vctools\crt\vcstartup\src\startup\exe_main.cpp(17): mainCRTStartup
???: BaseThreadInitThunk
???: RtlUserThreadStart


Metadata

Assignees

No one assigned

    Labels

    ep:TensorRT (issues related to TensorRT execution provider)
    stale (issues that have not been addressed in a while; categorized by a bot)
