Trouble Deploying YOLOv4 in Graph Composer

Environment Details:
Hardware Platform (Jetson / GPU): dGPU (x86_64)
DeepStream Version: 7.1
JetPack Version (valid for Jetson only): N/A
TensorRT Version: 10.3.0.26
NVIDIA GPU Driver Version: 560.35.05
Issue Type: Question

Hello, I’m trying to deploy a YOLOv4 model trained with NVIDIA TAO in Graph Composer, but I’m running into some issues and could really use some help.

I followed the TAO notebook for training and exporting the model. The TAO container automatically generated a config file when creating the .onnx file. I used this file to integrate the model into Graph Composer, but things didn’t go as expected.

The first issue I noticed is that Graph Composer ignored the pre-built TensorRT engine and recreated it from scratch instead. Then, when I tried to run the graph using execute_graph.sh, it failed, and I got this error:

(gxe:555929): GStreamer-WARNING **: 10:15:18.600: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstlibav.so': libblas.so.3: cannot open shared object file: No such file or directory
2025-02-10 10:15:18.712 INFO  extensions/nvdsbase/nvds_scheduler.cpp@381: NvDsScheduler Pipeline running

0:00:00.675303594 555929 0x71516c004400 ERROR                nvinfer gstnvinfer.cpp:678:gst_nvinfer_logger:<nvinfer_bin_nvinfer> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::parseBoundingBox() <nvdsinfer_context_impl_output_parsing.cpp:60> [UID = 1]: Could not find output coverage layer for parsing objects
0:00:00.675315581 555929 0x71516c004400 ERROR                nvinfer gstnvinfer.cpp:678:gst_nvinfer_logger:<nvinfer_bin_nvinfer> NvDsInferContext[UID 1]: Error in NvDsInferContextImpl::fillDetectionOutput() <nvdsinfer_context_impl_output_parsing.cpp:736> [UID = 1]: Failed to parse bboxes
====================================================================================================
|                            GXF terminated unexpectedly                                           |
====================================================================================================
#01 /opt/nvidia/graph-composer/gxe(+0x92fa) [0x60d731e1e2fa]
#02 /opt/nvidia/graph-composer/gxe(+0x244da) [0x60d731e394da]
#03 /opt/nvidia/graph-composer/gxe(+0x247bc) [0x60d731e397bc]
#04 /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7151a1c42520]
#05 attach_metadata_detector(_GstNvInfer*, _GstMiniObject*, GstNvInferFrame&, NvDsInferDetectionOutput&, float) /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so(_Z24attach_metadata_detectorP11_GstNvInferP14_GstMiniObjectR15GstNvInferFrameR24NvDsInferDetectionOutputf+0x87) [0x71519c6a12d7]
#06 /usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_infer.so(+0x1c1d8) [0x71519c6901d8]
#07 /lib/x86_64-linux-gnu/libglib-2.0.so.0(+0x84ab1) [0x7151a1b4aab1]
#08 /lib/x86_64-linux-gnu/libc.so.6(+0x94ac3) [0x7151a1c94ac3]
#09 /lib/x86_64-linux-gnu/libc.so.6(+0x126850) [0x7151a1d26850]
====================================================================================================
Minidump written to: /tmp/ec0d44a4-c6f3-45e6-5748b990-cd3d9d10.dmp
/opt/nvidia/graph-composer/execute_graph.sh: line 331: 555929 Segmentation fault      (core dumped) ${RUN_PREFIX} ${GXE_PATH} -app "${GRAPH_FILES}" -manifest "${MANIFEST_FILE}" ${RUN_POSTFIX}
*******************************************************************
End ds-t1.yaml
*******************************************************************
[INFO] Graph installation directory /tmp/ds.ds-t1 and manifest /tmp/ds.ds-t1/manifest.yaml retained

At this point, I’m not sure what the problem is.

Here’s the config file TAO generated when exporting the .onnx file:

nvifer_config.txt (200 Bytes)

And here’s the config file I used for Graph Composer:

config_infer_primary_yolo.txt (911 Bytes)

Has anyone successfully deployed a TAO-trained YOLOv4 model in Graph Composer? Any tips or things I should check?

Thanks in advance!

Your nvinfer configuration file relies on the default postprocessing inside gst-nvinfer; in our experience that will not work for YOLOv4, which requires a custom bounding-box parser. Since we don’t know what changes you have made to the TAO yolov4 model, we can only provide the sample configuration for the TAO pre-trained yolov4 model for your reference: deepstream_tao_apps/configs/nvinfer/yolov4_tao/pgie_yolov4_tao_config.txt at release/tao5.1_ds6.4ga · NVIDIA-AI-IOT/deepstream_tao_apps
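For reference, the postprocessing-related entries in that sample look roughly like the following sketch. The file paths, batch size, and class count below are placeholders to adapt to your own export; the key point is that a TAO YOLOv4 ONNX ends in BatchedNMS-style outputs, which the custom parser handles:

```ini
[property]
# Placeholder paths - point these at your own exported files
onnx-file=yolov4_resnet18_epoch_080.onnx
labelfile-path=labels.txt
batch-size=1
# network-mode: 0=FP32, 1=INT8, 2=FP16
network-mode=2
# Placeholder - set to the number of classes in your training spec
num-detected-classes=3
# The exported YOLOv4 model ends in BatchedNMS outputs, so the
# default DetectNet-style "coverage" parser cannot be used
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=post_processor/libnvds_infercustomparser_tao.so
```

The "Could not find output coverage layer" error in your log is exactly what the default parser prints when it looks for DetectNet_v2-style coverage/bbox outputs that this model does not have.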

Hi, I tried different configurations. The first time, I just followed the notebook. Then, I tried generating the engines with a batch size of 1, as the Test 1 Composer graph uses by default.

  • The configuration for the .onnx file I used:
# Generate .onnx file using tao container
!tao model yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.hdf5 \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config
  • The engine file configuration (default settings):
# Convert to TensorRT engine (FP16). 
!tao deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                                   --batch_size 16 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp16 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/trt.engine.fp16 \
                                   --results_dir $USER_EXPERIMENT_DIR/export
  • The engine file configuration with a different batch size:
# Convert to TensorRT engine (FP16). 
!tao deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
                                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                                   --batch_size 1 \
                                   --min_batch_size 1 \
                                   --opt_batch_size 8 \
                                   --max_batch_size 16 \
                                   --data_type fp16 \
                                   --engine_file $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_080.onnx_b1_gpu0_fp16.engine \
                                   --results_dir $USER_EXPERIMENT_DIR/export

So please modify the YOLOv4 configuration and postprocessing according to the ONNX model you generated.

It worked! I can now run YOLOv4 in Graph Composer. The missing piece was the custom parser library libnvds_infercustomparser_tao.so.
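For anyone hitting the same error: that library is built from the deepstream_tao_apps repository (its post_processor directory) and is wired into the nvinfer config with lines of this form (the path below is illustrative, not the exact one on my system):

```ini
parse-bbox-func-name=NvDsInferParseCustomBatchedNMSTLT
custom-lib-path=/path/to/deepstream_tao_apps/post_processor/libnvds_infercustomparser_tao.so
```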

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.