Inference error with custom train

OS: Linux 24.04

Docker container: nvcr.io/nvidia/deepstream:8.0-gc-triton-devel

Tao Version:

I ran the EfficientDet notebook and trained on a custom dataset, but running inference fails with errors in two situations:

First, running the .engine generated by TAO:
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
Stack trace (most recent call last) in thread 176:
#31 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898191fc87, in gst_element_change_state
#30 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898191fc43, in gst_element_change_state
#29 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898194b43e, in
#28 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e89818f75f7, in
#27 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e89819204f8, in
#26 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898191fc43, in gst_element_change_state
#25 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e8981920b22, in
#24 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e89819209ad, in
#23 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e89819a245d, in
#22 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898192ccb4, in gst_iterator_fold
#21 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898192084a, in
#20 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898193bcb4, in gst_pad_set_active
#19 Object “/usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0”, at 0x7e898193b2ef, in
#18 Object “/usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0”, at 0x7e89818758b6, in
#17 Object “/usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0”, at 0x7e89818756a3, in
#16 Object “/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so”, at 0x7e899b4a2e15, in
#15 Object “///opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_infer.so”, at 0x7e899b15eb7d, in createNvDsInferContext(INvDsInferContext**, _NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*))
#14 Object “///opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_infer.so”, at 0x7e899b15e141, in nvdsinfer::NvDsInferContextImpl::initialize(_NvDsInferContextInitParams&, void*, void (*)(INvDsInferContext*, unsigned int, NvDsInferLogLevel, char const*, void*))
#13 Object “///opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_infer.so”, at 0x7e899b15613d, in nvdsinfer::NvDsInferContextImpl::generateBackendContext(_NvDsInferContextInitParams&)
#12 Object “///opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_infer.so”, at 0x7e899b15541c, in nvdsinfer::NvDsInferContextImpl::deserializeEngineAndBackend(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, int, std::shared_ptr<nvdsinfer::TrtEngine>&, std::unique_ptr<nvdsinfer::BackendContext, std::default_delete<nvdsinfer::BackendContext>>&)
#11 Object “///opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_infer.so”, at 0x7e899b182dd8, in
#10 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939bec4cd, in
#9 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939bec435, in
#8 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939beb9f2, in
#7 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939bb4b6d, in
#6 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939bb43eb, in
#5 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e8939bb404c, in
#4 Object “/usr/lib/x86_64-linux-gnu/libnvinfer.so.10”, at 0x7e893929712f, in
#3 Object “/usr/lib/x86_64-linux-gnu/libunwind.so.8”, at 0x7e89829151ac, in __libunwind_Unwind_Resume
#2 Object “/usr/lib/x86_64-linux-gnu/libstdc++.so.6”, at 0x7e89ac34d71b, in __gxx_personality_v0
#1 Object “/usr/lib/x86_64-linux-gnu/libstdc++.so.6”, at 0x7e89ac34d590, in
#0 Object “/usr/lib/x86_64-linux-gnu/libstdc++.so.6”, at 0x7e89ac34d210, in
Segmentation fault (Signal sent by the kernel [(nil)])
/opt/nvidia/deepstream/deepstream-8.0/entrypoint.sh: line 15: 41 Segmentation fault (core dumped) /opt/nvidia/nvidia_entrypoint.sh $@
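A TensorRT .engine file is specific to the TensorRT version and GPU it was serialized on, so a segmentation fault inside deserializeEngineAndBackend() often means the engine was built with a different TensorRT version than the one shipped in the DeepStream 8.0 container. Before digging into that, a quick sanity check on the file itself rules out the simpler failure modes. A minimal sketch in Python, assuming the engine path from the logs above (substitute your own):

```python
import os

# Hypothetical path -- taken from the logs above; substitute your own.
ENGINE = "/app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.fp16.engine"

def check_engine(path):
    """Return a list of problems found with a serialized TensorRT engine file."""
    problems = []
    if not os.path.isfile(path):
        problems.append("engine file not found: " + path)
    elif os.path.getsize(path) == 0:
        problems.append("engine file is empty: " + path)
    return problems

if __name__ == "__main__":
    for p in check_engine(ENGINE):
        print("PROBLEM:", p)
```

If the file exists and is non-empty but DeepStream still crashes while deserializing it, the usual fix is to rebuild the engine inside the same container that will run it (e.g. let DeepStream build it from the ONNX), so the TensorRT versions match.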

Second, running the .onnx generated by TAO:

gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-8.0/lib/libnvds_nvmultiobjecttracker.so
[NvMultiObjectTracker] Initialized
WARNING: ../nvdsinfer/nvdsinfer_model_builder.cpp:1261 Deserialize engine failed because file path: /app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.fp16.engine open error
0:00:21.323347765 41 0x7919e8bdbc90 WARN nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:2097> [UID = 1]: deserialize engine from file :/app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.fp16.engine failed
0:00:21.323379299 41 0x7919e8bdbc90 WARN nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2202> [UID = 1]: deserialize backend context from engine from file :/app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.fp16.engine failed, try rebuild
0:00:21.323399979 41 0x7919e8bdbc90 INFO nvinfer gstnvinfer.cpp:685:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2123> [UID = 1]: Trying to create engine from model files
INFO: Device 19443010A15D842F00 connected successfully.
INFO: Using sensor preset: 1080p @ 60 FPS (1920x1080 @ 58.0 FPS)
INFO: Using crop preset: 640x640 (1:1) (640x640)
INFO: OAK stream mode: Raw (NV12)
WARNING: Attempt to update unknown control: update_crop_size_for_ds
Connected cameras: [<CameraBoardSocket.CAM_A: 0>, <CameraBoardSocket.CAM_B: 1>, <CameraBoardSocket.CAM_C: 2>]
INFO: Configuring ‘Stretch’ mode. Resizing from 1920x1080 to 640x640.
INFO: Depth disabled. Skipping creation of stereo nodes.
INFO: Applying initial camera settings…
INFO: Pipeline V3 started on the device.
WARNING: [TRT]: onnxOpImporters.cpp:6521: Attribute class_agnostic not found in plugin node! Ensure that the plugin creator has a default value defined or the engine may fail to build.
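For the second case, the “open error” warning simply means gst-nvinfer could not find an engine at model-engine-file and fell back to building one from the ONNX, which is expected on the first run. A minimal sketch of the relevant nvinfer config keys, assuming the paths from the logs (the ONNX filename is a guess; adjust to your setup):

```ini
[property]
# ONNX exported by TAO; nvinfer builds a TensorRT engine from it on first run
onnx-file=/app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.onnx
# Where the built engine is cached; the "open error" above just means this
# file did not exist yet at that path
model-engine-file=/app/Inference/configs/deepstream-app/models/tao-test/efficientdet-d0.fp16.engine
# 0=FP32, 1=INT8, 2=FP16
network-mode=2
```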

The zip file contains the TAO notebook, the pipeline, and the files for inference.

tao-test.zip (56.3 MB)

The inference test in the TAO notebook runs correctly after training; the result images can be seen below:

Hi, can anyone help me?

Moving to TAO forum.

The inference result is correct when you run inference with the tf model.
How about running inference against the TensorRT engine? i.e., what is the result in the cell below?

I tried the custom model in two ways: first, the ONNX model exported by TAO, letting DeepStream generate the TRT engine; second, the TRT engine generated by TAO itself. I can’t understand why it doesn’t work, because after the custom training in Jupyter the model works correctly.

Hi @henrique.cerqueira ,
As mentioned above, to narrow it down, could you please check whether tao deploy efficientdet_tf2 inference xxx generates the expected results?
If it does, that means the TAO pipeline works, and we can then check the gap on the DeepStream side.
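For reference, a sketch of that check (the xxx placeholder above stands for task-specific arguments that I won’t guess; the spec file path below is hypothetical and follows the usual TAO Deploy -e convention):

```shell
# Run TAO Deploy inference against the TensorRT engine produced by TAO
tao deploy efficientdet_tf2 inference \
  -e /workspace/specs/experiment.yaml
```

If this produces correct detections, the exported engine itself is fine and the remaining gap is on the DeepStream side (config, preprocessing, or TensorRT version mismatch).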