DOCA Example App Runtime Error: errno=UNKNOWN-errno14

Hi, I’m trying to run the gpu_packet_processing example described in the DOCA GPUNetIO Install Page.

My eventual target is the system described in another forum post (DOCA gpu_packet_processing runtime error), but for now I’m trying to run the example on a normal x86_64 tower PC with an NVIDIA RTX A4500 GPU and a ConnectX-6 Dx NIC.

My PCIe bus looks like this:

 +-[0000:50]-+-00.0  Intel Corporation Device 09a2
 |           +-00.1  Intel Corporation Device 09a4
 |           +-00.2  Intel Corporation Device 09a3
 |           +-00.4  Intel Corporation Device 0998
 |           +-02.0-[51]--+-00.0  Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
 |           |            \-00.1  Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
 |           \-04.0-[52]--+-00.0  NVIDIA Corporation GA102GL [RTX A4500]
 |                        \-00.1  NVIDIA Corporation GA102 High Definition Audio Controller

Per the guide, I have the latest versions of CUDA (12.5) and the “open” flavor of the NVIDIA drivers (version 555).

I’ve also set up the ConnectX adapter in Ethernet mode, disabled ACS in BIOS, enabled resizeable BAR1, and set up hugepages.
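
As a quick sanity check on the hugepage part of this setup, the counters in /proc/meminfo can be read directly. The sketch below is only illustrative (the helper name `hugepage_stats` is hypothetical, not part of DOCA or DPDK); it parses the standard `HugePages_Total`, `HugePages_Free`, and `Hugepagesize` fields:

```python
import os
import re

def hugepage_stats(meminfo_text: str) -> dict:
    """Extract the hugepage counters from the text of /proc/meminfo."""
    stats = {}
    for key in ("HugePages_Total", "HugePages_Free", "Hugepagesize"):
        m = re.search(rf"^{key}:\s+(\d+)", meminfo_text, re.MULTILINE)
        if m:
            stats[key] = int(m.group(1))  # Hugepagesize is reported in kB
    return stats

if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(hugepage_stats(f.read()))
```

A `HugePages_Free` value of 0 would mean nothing is left for the application to reserve at startup.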

I’ve run the app with some debug logging enabled and was hoping someone would be able to point me in the right direction. I’m mostly getting errors like:

devx adapter 0x55ffaca23c70: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
Failed to create umem with dmabuf_fd with exception:
DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
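
For readers decoding the message: `UNKNOWN-errno14` is a raw errno value that the logger did not translate into a name. On Linux, errno 14 is `EFAULT` ("Bad address"). A minimal sketch for looking up any numeric errno with the Python standard library (the helper name `describe_errno` is made up for illustration):

```python
import errno
import os

def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")  # symbolic name, e.g. EFAULT
    return f"{name}: {os.strerror(code)}"        # human-readable message from libc

if __name__ == "__main__":
    print(describe_errno(14))
```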

Here’s the full output:

# ./doca_gpu_packet_processing -n 51:00.0 -g 52:00.0 -q 1 -l 60 --sdk-log-level 50
[22:34:55:548040][1211][DOCA][INF][gpu_packet_processing.c:284][main] ===========================================================
[22:34:55:548095][1211][DOCA][INF][gpu_packet_processing.c:285][main] DOCA version: 2.7.0085
[22:34:55:548098][1211][DOCA][INF][gpu_packet_processing.c:286][main] ===========================================================
[22:34:55:548123][1211][DOCA][INF][gpu_packet_processing.c:307][main] Options enabled:
        GPU 52:00.0
        NIC 51:00.0
        GPU Rx queues 1
        GPU HTTP server enabled No
[22:34:55:830769][1211][DOCA][INF][doca_dev.cpp:578][doca_devinfo_create_list] Devinfo list 0x5570f5528098: Added device=0x5570f55271d0 to devinfo list
[22:34:55:830790][1211][DOCA][INF][doca_dev.cpp:578][doca_devinfo_create_list] Devinfo list 0x5570f5528098: Added device=0x5570f5528310 to devinfo list
[22:34:55:830794][1211][DOCA][INF][doca_dev.cpp:587][doca_devinfo_create_list] Devinfo list 0x5570f5528098 was created
[22:34:55:837860][1211][DOCA][INF][doca_dev.cpp:1003][doca_dev_open] Local device 0x5570f55271d0 was opened
[22:34:55:837879][1211][DOCA][INF][doca_dev.cpp:146][dev_put] Device 0x5570f5528310 was destroyed
[22:34:55:837891][1211][DOCA][INF][doca_dev.cpp:668][doca_devinfo_destroy_list] Devinfo list 0x5570f5528098 was destroyed
EAL: Detected CPU lcores: 20
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
EAL: Probe PCI driver: mlx5_pci (15b3:101d) device: 0000:51:00.0 (socket 0)
EAL: Probe PCI driver: gpu_cuda (10de:2232) device: 0000:52:00.0 (socket 0)
[22:34:56:036111][1211][DOCA][INF][doca_sub_dev.cpp:45][priv_doca_sub_dev_gpu_ops_set] sub_dev: gpu_ops was set to 0x7fcc771dfd20
[22:34:56:663305][1211][DOCA][WRN][engine_model.c:90][adapt_queue_depth] adapting queue depth to 128.
[22:34:56:663332][1211][DOCA][INF][engine_model.c:151][engine_model_init] engine model defined with mode=vnf
[22:34:56:663342][1211][DOCA][INF][engine_model.c:152][engine_model_init] engine model defined with nr_pipe_queues=1
[22:34:56:663352][1211][DOCA][INF][engine_model.c:153][engine_model_init] engine model defined with pipe_queue_depth=128
[22:34:56:663360][1211][DOCA][INF][engine_model.c:155][engine_model_init] engine model defined in isolated mode
[22:34:56:663369][1211][DOCA][INF][engine_model.c:156][engine_model_init] engine model defined RSS with nr_queues=0
[22:34:56:663377][1211][DOCA][INF][engine_model.c:157][engine_model_init] engine model defined with nr_counters=524228
[22:34:56:663386][1211][DOCA][INF][engine_model.c:158][engine_model_init] engine model defined with nr_meters=0
[22:34:56:663395][1211][DOCA][INF][engine_model.c:159][engine_model_init] engine model defined with nr_acl_collisions=3
[22:34:56:663982][1211][DOCA][INF][engine_field_mapping.c:109][engine_field_mapping_init] Engine field mapping initialized
[22:34:56:663992][1211][DOCA][INF][engine_shared_resources.c:155][engine_shared_resources_init] Engine shared resources initialized successfully
[22:34:56:664321][1211][DOCA][INF][dpdk_port.c:853][dpdk_port_module_init] dpdk port module init
[22:34:56:664331][1211][DOCA][INF][dpdk_table.c:1353][dpdk_table_module_init] Initializing dpdk table successfully
[22:34:56:664339][1211][DOCA][INF][dpdk_flow.c:59][dpdk_flow_module_init] Initializing dpdk flow successfully
[22:34:56:664349][1211][DOCA][INF][dpdk_resource_manager.c:210][dpdk_resource_manager_module_init] Dpdk resource manager register completed
[22:34:56:710311][1211][DOCA][INF][dpdk_pipe_items.c:210][dpdk_pipe_items_module_init] Initialized dpdk pipe items module
[22:34:56:710338][1211][DOCA][INF][dpdk_pipe_geneve_opt.c:125][dpdk_pipe_geneve_opt_module_init] Initialized dpdk pipe GENEVE options module
[22:34:56:710427][1211][DOCA][INF][dpdk_pipe.c:203][dpdk_pipe_module_init] Dpdk pipe initialized successfully
[22:34:56:710456][1211][DOCA][INF][dpdk_layer.c:150][dpdk_layer_register] Dpdk layer register completed
[22:34:56:710636][1211][DOCA][INF][doca_flow_match.c:694][doca_flow_match_init] Doca flow match UDS initialized
[22:34:56:710885][1211][DOCA][INF][doca_flow_actions.c:1209][doca_flow_actions_init] Doca flow actions UDS initialized
[22:34:56:710903][1211][DOCA][INF][doca_flow_monitor.c:202][doca_flow_monitor_init] Doca flow monitor UDS initialized
[22:34:56:710907][1211][DOCA][INF][doca_flow_layer.c:94][doca_flow_layer_init] Doca flow layer initialized
[22:34:56:710911][1211][DOCA][INF][doca_flow.c:617][doca_flow_init] Doca flow initialized successfully
[22:34:56:710976][1211][DOCA][INF][utils_hash_table.c:119][utils_hash_table_create] hash table a_tmplt_t port 0 created
[22:34:56:710986][1211][DOCA][INF][utils_hash_table.c:119][utils_hash_table_create] hash table p_tmplt_t port 0 created
[22:34:56:711051][1211][DOCA][INF][utils_hash_table.c:119][utils_hash_table_create] hash table dpdk_tbl_mgr port 0 created
[22:34:56:711530][1211][DOCA][INF][dpdk_meter_profiles.c:202][dpdk_meter_profiles_create] Created meter profiles on port 0 with 2 caches, 128 profiles
[22:34:57:522200][1211][DOCA][INF][dpdk_port.c:1033][dpdk_port_create] Dpdk port 0 initialized successfully with 2 queues
[22:34:57:563033][1211][DOCA][INF][doca_flow.c:1536][doca_flow_port_start] doca flow port with id=0 started
[22:34:57:563099][1211][DOCA][INF][doca_pe.cpp:46][doca_pe_create] Progress engine 0x5570f566cb80 was created
[22:34:57:563113][1211][DOCA][INF][udp_queues.c:45][create_udp_queues] Creating UDP Eth Rxq 0
[22:34:57:563140][1211][DOCA][INF][doca_eth_rxq.c:1955][doca_eth_rxq_set_type] ETH_RXQ 0x5570f566cc30: queue_type was set to DOCA_ETH_RXQ_TYPE_CYCLIC
[22:34:57:563160][1211][DOCA][INF][doca_mmap.cpp:558][doca_mmap_create] Mmap 0x5570f566d0c0 was created, access_mask=0x1
[22:34:57:563796][1211][DOCA][INF][udp_queues.c:121][create_udp_queues] Mapping receive queue buffer (0x0x7fc3ac000000 size 536870912B dmabuf fd 134) with dmabuf mode
[22:34:57:563818][1211][DOCA][INF][doca_mmap.cpp:1846][doca_mmap_set_dmabuf_memrange] Mmap 0x5570f566d0c0: Set dmabuf_memrange.
[22:34:57:563831][1211][DOCA][INF][doca_mmap.cpp:1950][doca_mmap_set_permissions] Mmap 0x5570f566d0c0: Set permissions with access_mask=0x41
[22:34:57:568642][1211][DOCA][INF][doca_mmap.cpp:741][doca_mmap_start] Mmap 0x5570f566d0c0: mmap was started
[22:34:57:568656][1211][DOCA][INF][doca_eth_rxq.c:2642][doca_eth_rxq_set_pkt_buf] ETH_RXQ 0x5570f566cc30: mmap was set to 0x5570f566d0c0
[22:34:57:568662][1211][DOCA][INF][doca_eth_rxq.c:2643][doca_eth_rxq_set_pkt_buf] ETH_RXQ 0x5570f566cc30: offset was set to 0
[22:34:57:568666][1211][DOCA][INF][doca_eth_rxq.c:2644][doca_eth_rxq_set_pkt_buf] ETH_RXQ 0x5570f566cc30: size was set to 536870912
[22:34:57:568673][1211][DOCA][INF][doca_ctx.cpp:203][doca_ctx_start] CTX 0x5570f566cc30 does not require PE
[22:34:57:571139][1211][DOCA][INF][doca_buf_array.cpp:218][doca_buf_arr] buf_arr 0x5570f559c3e0 was created
[22:34:57:571157][1211][DOCA][INF][doca_buf_array.cpp:219][doca_buf_arr]   num_elem=65536
[22:34:57:571162][1211][DOCA][INF][doca_buf_array.cpp:286][set_target_gpu] buf_arr 0x5570f559c3e0: target_gpu was set to 0x5570f55a1490
[22:34:57:571167][1211][DOCA][INF][doca_buf_array.cpp:236][set_params] buf_arr 0x5570f559c3e0: elem_size was set to 8192
[22:34:57:571171][1211][DOCA][INF][doca_buf_array.cpp:237][set_params] buf_arr 0x5570f559c3e0: start_offset was set to 0
[22:34:57:572543][1211][DOCA][INF][doca_buf_array.cpp:414][start] buf_arr 0x5570f559c3e0: buf_arr was started
[22:34:57:573155][1211][DOCA][INF][doca_uar.cpp:207][bridge_init] UAR 0x5570f566f6c0 created: page=0x7fcc684b2000, reg_addr=0x7fcc684b2800, base_addr=0x7fcc684b2000, id=133, alloc_type=BLUEFLAME
[22:34:57:574980][1211][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x5570f55281c0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[22:34:57:575072][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[22:34:57:575089][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[22:34:57:575108][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] Failed to create umem with dmabuf_fd with exception:
[22:34:57:575119][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[22:34:57:577291][1211][DOCA][INF][eth_rxq_common.c:451][eth_rxq_common_create_cq] ETH_RXQ 0x5570f566cc30: Created CQ 0x49a
[22:34:57:577439][1211][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x5570f55281c0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[22:34:57:577466][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[22:34:57:577482][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[22:34:57:577495][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] Failed to create umem with dmabuf_fd with exception:
[22:34:57:577505][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[22:34:57:579415][1211][DOCA][INF][eth_rxq_common.c:718][eth_rxq_common_create_rq] ETH_RXQ 0x5570f566cc30: Created RQ 0xc0004b
[22:34:57:579559][1211][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x5570f55281c0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[22:34:57:579585][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[22:34:57:579597][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[22:34:57:579608][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] Failed to create umem with dmabuf_fd with exception:
[22:34:57:579617][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[22:34:57:580479][1211][DOCA][INF][eth_rxq_common.c:451][eth_rxq_common_create_cq] ETH_RXQ 0x5570f566cc30: Created CQ 0x49b
[22:34:57:580546][1211][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x5570f55281c0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[22:34:57:580559][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[22:34:57:580569][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[22:34:57:580578][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] Failed to create umem with dmabuf_fd with exception:
[22:34:57:580589][1211][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[22:34:57:583586][1211][DOCA][INF][doca_qp.cpp:986][priv_doca_dev_qp_create] Device 0x5570f55271d0: qp=0x5570f566a780 was created
[22:34:57:585903][1211][DOCA][INF][doca_qp.cpp:948][set_state] QP 0x5570f566a780: State change INIT -> CONNECTED
[22:34:57:585935][1211][DOCA][INF][eth_rxq_common.c:299][eth_rxq_common_create_flush_qp] ETH_RXQ 0x5570f566cc30: Created flush QP 0xab
[22:34:57:590187][1211][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x5570f55281c0: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[22:34:57:590227][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[22:34:57:590243][1211][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[22:34:57:590263][1211][DOCA][ERR][doca_dev.cpp:2773][priv_doca_dev_mapped_memory_region_create_dmabuf] Failed to create mapped memory region with dmabuf: failed to allocate memory for mr with exception:
[22:34:57:590276][1211][DOCA][ERR][doca_dev.cpp:2773][priv_doca_dev_mapped_memory_region_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[22:34:57:590283][1211][DOCA][ERR][cb_ops.cpp:395][cb_doca_gpu_export_eth_rxq] Failed allocating Flush QP mem region
[22:34:57:590318][1211][DOCA][ERR][doca_eth_rxq.c:1665][eth_rxq_start_gpu_ctx] ETH_RXQ 0x5570f566cc30: Failed to start eth_rxq: unable to export eth_rxq to GPU. err=DOCA_ERROR_DRIVER
[22:34:57:592435][1211][DOCA][INF][eth_rxq_common.c:618][eth_rxq_common_destroy_rq] ETH_RXQ 0x5570f566cc30: Destroyed RQ 0xc0004b
[22:34:57:593352][1211][DOCA][INF][eth_rxq_common.c:524][eth_rxq_common_destroy_cq] ETH_RXQ 0x5570f566cc30: Destroyed CQ 0x49a
[22:34:57:593367][1211][DOCA][INF][doca_qp.cpp:999][priv_doca_dev_qp_destroy] Destroying qp=0x5570f566a780
[22:34:57:594825][1211][DOCA][INF][eth_rxq_common.c:174][eth_rxq_common_destroy_flush_qp] ETH_RXQ 0x5570f566cc30: Destroyed flush QP 0xab
[22:34:57:595397][1211][DOCA][INF][eth_rxq_common.c:524][eth_rxq_common_destroy_cq] ETH_RXQ 0x5570f566cc30: Destroyed CQ 0x49b
[22:34:57:595466][1211][DOCA][INF][doca_buf_array.cpp:427][stop] buf_arr 0x5570f559c3e0: buf_arr was stopped
[22:34:57:595472][1211][DOCA][ERR][doca_ctx.cpp:227][doca_ctx_start] Failed to start context 0x7ffff3f05ce8 with status DOCA_ERROR_DRIVER
[22:34:57:595477][1211][DOCA][ERR][udp_queues.c:180][create_udp_queues] Failed doca_ctx_start: DOCA Driver call failure
[22:34:57:595482][1211][DOCA][INF][udp_queues.c:267][destroy_udp_queues] Destroying UDP queue 0
[22:34:57:595487][1211][DOCA][INF][doca_pe.cpp:114][priv_doca_pe_ctx_destroy] Destroying progress engine ctx=0x5570f566cc30
[22:34:57:595493][1211][DOCA][INF][doca_mmap.cpp:575][doca_mmap_destroy] Mmap 0x5570f566d0c0: Destroying mmap
[22:34:57:595644][1211][DOCA][INF][doca_mmap.cpp:762][doca_mmap_stop] Mmap 0x5570f566d0c0: mmap was stopped
[22:34:57:595708][1211][DOCA][ERR][gpu_packet_processing.c:350][main] Function create_udp_queues returned Bad State

Please let me know if anything sticks out to you or if there are other details I could provide. Thank you!

Same thing here. It works on another machine, but I get the same error on one of my machines, even though it has resizable BAR enabled…

I have the same issue here. Are you using a BlueField-2 in NIC mode? What do your interfaces look like? I can’t find any BlueField interface in ifconfig. Thanks.

Hi @brivia

Actually, that problem was caused by the nvidia_peermem module not being loaded.

So you should

modprobe nvidia_peermem
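
To confirm whether the module is actually loaded (before or after the `modprobe`), one option is to look at /proc/modules, where the first field of each line is a loaded module name. A minimal sketch, with `module_loaded` being a hypothetical helper name:

```python
import os

def module_loaded(name: str, proc_modules_text: str) -> bool:
    """True if `name` appears as the first field of a line in /proc/modules."""
    wanted = name.replace("-", "_")  # the kernel reports module names with underscores
    return any(
        line.split()[0] == wanted
        for line in proc_modules_text.splitlines()
        if line.strip()
    )

if __name__ == "__main__" and os.path.exists("/proc/modules"):
    with open("/proc/modules") as f:
        text = f.read()
    for mod in ("nvidia", "nvidia_peermem"):
        print(f"{mod}: {'loaded' if module_loaded(mod, text) else 'NOT loaded'}")
```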

However, I did not get much further on that machine; see the thread “DOCA GPUNetIO does not receive packets on some machine”.

Thanks for the information. I tried reloading the module, but it still failed with “errno14”. Thanks a lot; I will try to find out what is going on with the module.

I am also running into this error when trying to run the doca_gpunetio_rdma_client_server_write sample application. Here is my output with some additional debug enabled.

root@SYS-540A-TR:/opt/mellanox/doca/samples/doca_gpunetio/gpunetio_rdma_client_server_write/build# ./doca_gpunetio_rdma_client_server_write -l 70 --sdk-log-level 70 -gpu 52:00.0 -d mlx5_0
[18:54:15:756826][1412][DOCA][INF][gpunetio_rdma_client_server_write_main.c:250][main] Starting the sample
[18:54:15:756979][1412][DOCA][INF][doca_dev.cpp:579][doca_devinfo_create_list] Devinfo list 0x55cc8aa66e08: Added device=0x55cc8aa52a50 to devinfo list
[18:54:15:756986][1412][DOCA][INF][doca_dev.cpp:579][doca_devinfo_create_list] Devinfo list 0x55cc8aa66e08: Added device=0x55cc8aa52410 to devinfo list
[18:54:15:756989][1412][DOCA][INF][doca_dev.cpp:588][doca_devinfo_create_list] Devinfo list 0x55cc8aa66e08 was created
[18:54:15:777394][1412][DOCA][INF][doca_dev.cpp:1004][doca_dev_open] Local device 0x55cc8aa52a50 was opened
[18:54:15:777401][1412][DOCA][INF][doca_dev.cpp:147][dev_put] Device 0x55cc8aa52410 was destroyed
[18:54:15:777409][1412][DOCA][INF][doca_dev.cpp:669][doca_devinfo_destroy_list] Devinfo list 0x55cc8aa66e08 was destroyed
EAL: Detected CPU lcores: 40
EAL: Detected NUMA nodes: 1
EAL: Detected shared linkage of DPDK
EAL: Selected IOVA mode 'PA'
EAL: No free 2048 kB hugepages reported on node 0
EAL: VFIO support initialized
TELEMETRY: No legacy callbacks, legacy socket not created
[18:54:16:042771][1412][DOCA][DBG][doca_gpunetio.cpp:145][doca_gpu_create] GPU needs flush 1
EAL: Probe PCI driver: gpu_cuda (10de:2232) device: 0000:52:00.0 (socket 0)
[18:54:16:151384][1412][DOCA][INF][doca_sub_dev.cpp:45][priv_doca_sub_dev_gpu_ops_set] sub_dev: gpu_ops was set to 0x7f838e7f3d20
[18:54:16:152205][1412][DOCA][DBG][priv_doca_rdma.cpp:322][priv_doca_rdma_get_max_send_data_segs] devinfo 0x55cc8aa52ab0: max_send_data_segs=64 for transport_type=RC
[18:54:16:152429][1412][DOCA][DBG][priv_doca_rdma.cpp:322][priv_doca_rdma_get_max_send_data_segs] devinfo 0x55cc8aa52ab0: max_send_data_segs=16 for transport_type=DC
[18:54:16:152643][1412][DOCA][DBG][priv_doca_rdma.cpp:281][priv_doca_rdma_get_max_recv_data_segs] devinfo 0x55cc8aa52ab0: max_recv_data_segs=32
[18:54:16:152979][1412][DOCA][INF][doca_rdma.cpp:267][doca_rdma_create] RDMA=0x55cc8b791380 was created, dev=0x55cc8aa52a50
[18:54:16:152984][1412][DOCA][INF][doca_rdma.cpp:1694][doca_rdma_set_permissions] RDMA 0x55cc8b791380: permissions were set to 0x5
[18:54:16:152987][1412][DOCA][INF][doca_rdma.cpp:1482][doca_rdma_set_send_queue_size] RDMA 0x55cc8b791380: send_queue_size was set to 8192
[18:54:16:152993][1412][DOCA][INF][doca_rdma.cpp:1532][doca_rdma_set_recv_queue_size] RDMA 0x55cc8b791380: recv_queue_size was set to 8192
[18:54:16:152996][1412][DOCA][INF][doca_rdma.cpp:1722][doca_rdma_set_grh_enabled] RDMA 0x55cc8b791380: grh_enabled was set to 1
[18:54:16:153001][1412][DOCA][DBG][doca_ctx.cpp:197][adjust_ctx_ops_to_data_path] CTX 0x55cc8b791380: ctx GPU ops are empty
[18:54:16:153006][1412][DOCA][INF][doca_ctx.cpp:253][doca_ctx_start] CTX 0x55cc8b791380 does not require PE
[18:54:16:153275][1412][DOCA][INF][doca_uar.cpp:207][bridge_init] UAR 0x55cc8b814800 created: page=0x7f838e680000, reg_addr=0x7f838e680800, base_addr=0x7f838e680000, id=259, alloc_type=NONCACHE_DEDICATED
[18:54:16:153989][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8b696330 created (m_devx_object=0x55cc8b811ab0)
[18:54:16:154607][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8b816b20 created (m_devx_object=0x55cc8b816c70)
[18:54:16:155168][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8aa66aa0 created (m_devx_object=0x55cc8b817b70)
[18:54:16:155339][1412][DOCA][DBG][doca_qp.cpp:1056][set_state] QP 0x55cc8b816a60: State is already INIT
[18:54:16:155343][1412][DOCA][INF][doca_qp.cpp:1087][priv_doca_dev_qp_create] Device 0x55cc8aa52a50: qp=0x55cc8b816a60 was created
[18:54:16:155398][1412][DOCA][INF][doca_uar.cpp:207][bridge_init] UAR 0x55cc8b811730 created: page=0x7f838e680000, reg_addr=0x7f838e680900, base_addr=0x7f838e680000, id=259, alloc_type=NONCACHE_DEDICATED
[18:54:16:156240][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8b818260 created (m_devx_object=0x55cc8b8193f0)
[18:54:16:156287][1412][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x55cc8aa67c40: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[18:54:16:156336][1412][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[18:54:16:156346][1412][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[18:54:16:156358][1412][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] Failed to create umem with dmabuf_fd with exception:
[18:54:16:156367][1412][DOCA][ERR][doca_umem.cpp:131][priv_doca_umem_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[18:54:16:156745][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8b8181a0 created (m_devx_object=0x55cc8b8183a0)
[18:54:16:156906][1412][DOCA][DBG][doca_qp.cpp:1056][set_state] QP 0x55cc8b818200: State is already INIT
[18:54:16:156910][1412][DOCA][INF][doca_qp.cpp:1087][priv_doca_dev_qp_create] Device 0x55cc8aa52a50: qp=0x55cc8b818200 was created
[18:54:16:157422][1412][DOCA][INF][doca_qp.cpp:1049][set_state] QP 0x55cc8b818200: State change INIT -> CONNECTED
[18:54:16:157456][1412][DOCA][ERR][linux_devx_adapter.cpp:322][umem_reg] devx adapter 0x55cc8aa67c40: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[18:54:16:157464][1412][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] Failed to register dmabuf umem with exception:
[18:54:16:157469][1412][DOCA][ERR][doca_umem.cpp:256][bridge_init_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[18:54:16:157478][1412][DOCA][ERR][doca_dev.cpp:2833][priv_doca_dev_mapped_memory_region_create_dmabuf] Failed to create mapped memory region with dmabuf: failed to allocate memory for mr with exception:
[18:54:16:157484][1412][DOCA][ERR][doca_dev.cpp:2833][priv_doca_dev_mapped_memory_region_create_dmabuf] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[18:54:16:157910][1412][DOCA][DBG][linux_devx_obj.cpp:68][priv_doca_devx_object] DEVX obj 0x55cc8b818140 created (m_devx_object=0x55cc8b8182e0)
[18:54:16:157914][1412][DOCA][INF][doca_dev.cpp:2759][priv_doca_dev_mapped_memory_region_create_pg_sz] Device 0x55cc8aa52a50: mapped_memory_region=0x55cc8b8198d0 was created
[18:54:16:157919][1412][DOCA][INF][priv_doca_rdma_gpu.cpp:291][priv_doca_rdma_gpu_create_flush_qp] RDMA 0x55cc8b791380: Created flush QP 0x0x55cc8b818200
[18:54:16:158583][1412][DOCA][DBG][cb_ops.cpp:45][internal_gpu_set_invalid_cqe] Set to invalid GPU CQE addr 0x7f835aa00000 CQE num 16384
[18:54:16:158608][1412][DOCA][DBG][cb_ops.cpp:85][internal_gpu_set_flush_wqe] Prepared Flush QP WQE addr 0x7f835a502000 num 512 size 32768

[18:54:16:158800][1412][DOCA][DBG][cb_ops.cpp:45][internal_gpu_set_invalid_cqe] Set to invalid GPU CQE addr 0x7f835a400000 CQE num 16384
[18:54:16:158990][1412][DOCA][DBG][cb_ops.cpp:45][internal_gpu_set_invalid_cqe] Set to invalid GPU CQE addr 0x7f835a600000 CQE num 16384
[18:54:16:159016][1412][DOCA][ERR][rdma_common.c:64][oob_connection_server_setup] Socket created successfully
[18:54:16:159023][1412][DOCA][INF][rdma_common.c:83][oob_connection_server_setup] Done with binding
[18:54:16:159029][1412][DOCA][INF][rdma_common.c:91][oob_connection_server_setup] Listening for incoming connections

I am using DOCA 2.8. Please let me know if anything sticks out to you or if there are other details I could provide. Thank you!

This is not related to GPUNetIO; it’s a bug in the DOCA 2.8 Core library that we found and fixed in the recent DOCA 2.9. To make dmabuf work, you need the CUDA driver installed with the open kernel module flavor AND a Linux kernel >= 6.2. If you meet only one of the requirements (CUDA driver in open mode), it will fail.

To make it work with DOCA 2.8, please reinstall CUDA without the open flag and use nvidia-peermem to map the UAR.
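
The kernel half of that requirement is easy to check programmatically. The following is a rough sketch only: the 6.2 threshold is taken from the reply above, and the helper names `kernel_tuple` and `kernel_ok_for_dmabuf` are invented for this example.

```python
import platform
import re

MIN_KERNEL = (6, 2)  # threshold quoted in the reply above

def kernel_tuple(release: str) -> tuple:
    """Parse the leading 'major.minor' of a release string like '6.5.0-35-generic'."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(m.group(1)), int(m.group(2))

def kernel_ok_for_dmabuf(release: str) -> bool:
    """True if the kernel release is at least MIN_KERNEL."""
    return kernel_tuple(release) >= MIN_KERNEL

if __name__ == "__main__" and platform.system() == "Linux":
    rel = platform.release()
    verdict = "meets" if kernel_ok_for_dmabuf(rel) else "is below"
    print(f"kernel {rel} {verdict} the 6.2 requirement")
```

Comparing integer tuples rather than strings avoids the usual pitfall where "6.10" sorts before "6.2".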

Hi, I am using CUDA Toolkit 12.6 with driver 560.35.03. How do I check whether it is installed with the open kernel modules?
I am using DOCA 2.9.

I still have an issue in doca_gpu_dmabuf_fd.

Can you report the error you see? How did you install CUDA? Via .deb or .run?
If it’s via .deb, you need to follow the open kernel module flavor instructions.
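
On the question of how to tell which driver flavor is loaded: one heuristic is to read /proc/driver/nvidia/version. The sketch below assumes that the open flavor identifies itself with the words "Open Kernel Module" in that file; treat that as an assumption and cross-check against the driver installation guide. The helper name `is_open_flavor` is hypothetical.

```python
import os

NVIDIA_VERSION_FILE = "/proc/driver/nvidia/version"

def is_open_flavor(version_text: str) -> bool:
    """Heuristic: assume the open flavor names itself in the version string."""
    return "open kernel module" in version_text.lower()

if __name__ == "__main__":
    if os.path.exists(NVIDIA_VERSION_FILE):
        with open(NVIDIA_VERSION_FILE) as f:
            text = f.read()
        print("open flavor" if is_open_flavor(text) else "not recognized as open flavor")
    else:
        print("NVIDIA kernel driver does not appear to be loaded")
```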

@eagostini are nvidia-peermem and dmabuf mutually exclusive? If dmabuf is available, we don’t need nvidia-peermem, right?

Correct. In DOCA, we give priority to dmabuf; if it doesn’t work (warnings are printed on the console), it falls back to nvidia-peermem.
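
Put together with the requirements mentioned earlier in the thread, the priority order can be modeled as a small decision function. This is only a model of the behavior as described here, not DOCA's actual code, and it does not capture the DOCA 2.8 bug discussed above; the function name and the three boolean inputs are invented for illustration.

```python
def expected_mapping_path(open_driver: bool, kernel_ge_6_2: bool, peermem_loaded: bool) -> str:
    """Model of the priority order described in this thread (not DOCA's real code)."""
    if open_driver and kernel_ge_6_2:
        return "dmabuf"          # preferred path when both requirements are met
    if peermem_loaded:
        return "nvidia-peermem"  # fallback path
    return "none"                # neither mechanism is available

if __name__ == "__main__":
    # Example: open driver on an older kernel, with nvidia_peermem loaded.
    print(expected_mapping_path(open_driver=True, kernel_ge_6_2=False, peermem_loaded=True))
```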

@eagostini I am getting a memory allocation failure when trying to run the DOCA sample app (DOCA 3.0):

AS-5126GS-TNRT2:/tmp/doca$ sudo /opt/mellanox/doca/samples/doca_gpunetio/gpunetio_rdma_client_server_write/build/doca_gpunetio_rdma_client_server_write -d rocep133s0 -gpu e3:00.0  -l 70 --sdk-log-level 70
[12:06:40:748248][2843851][DOCA][INF][gpunetio_rdma_client_server_write_main.c:461][main] Starting the sample
[12:06:40:748691][2843851][DOCA][INF][doca_dev.cpp:622][doca_devinfo_create_list] Devinfo list 0x589f5c3c3598: Added device=0x589f5c3b7cd0 to devinfo list
[12:06:40:748705][2843851][DOCA][INF][doca_dev.cpp:631][doca_devinfo_create_list] Devinfo list 0x589f5c3c3598 was created
[12:06:40:754740][2843851][DOCA][INF][linux_devx_adapter.cpp:96][open] devx adapter 0x589f5c3bec40: opened ibv_ctx 0x589f5c3bec80
[12:06:40:755396][2843851][DOCA][INF][doca_dev.cpp:993][doca_dev_open] Local device 0x589f5c3b7cd0 was opened
[12:06:40:755405][2843851][DOCA][INF][doca_dev.cpp:712][doca_devinfo_destroy_list] Devinfo list 0x589f5c3c3598 was destroyed
[12:06:41:894003][2843851][DOCA][DBG][doca_gpunetio.cpp:149][doca_gpu_create] GPU needs flush 0
[12:06:41:894026][2843851][DOCA][INF][doca_sub_dev.cpp:45][priv_doca_sub_dev_gpu_ops_set] sub_dev: gpu_ops was set to 0x727635980cc0
[12:06:41:894232][2843851][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x589f5c3b7d30: max_send_data_segs=64 for transport_type=RC
[12:06:41:894242][2843851][DOCA][DBG][priv_doca_rdma.cpp:356][priv_doca_rdma_get_max_send_data_segs] devinfo 0x589f5c3b7d30: max_send_data_segs=16 for transport_type=DC
[12:06:41:894245][2843851][DOCA][DBG][priv_doca_rdma.cpp:315][priv_doca_rdma_get_max_recv_data_segs] devinfo 0x589f5c3b7d30: max_recv_data_segs=32
[12:06:41:894672][2843851][DOCA][INF][doca_rdma.cpp:262][doca_rdma_create] RDMA=0x727634cde010 was created, dev=0x589f5c3b7cd0
[12:06:41:894678][2843851][DOCA][INF][doca_rdma.cpp:1773][doca_rdma_set_permissions] RDMA 0x727634cde010: permissions were set to 0x5
[12:06:41:894680][2843851][DOCA][INF][doca_rdma.cpp:1567][doca_rdma_set_send_queue_size] RDMA 0x727634cde010: send_queue_size was set to 8192
[12:06:41:894685][2843851][DOCA][INF][doca_rdma.cpp:1611][doca_rdma_set_recv_queue_size] RDMA 0x727634cde010: recv_queue_size was set to 8192
[12:06:41:894687][2843851][DOCA][INF][doca_rdma.cpp:1801][doca_rdma_set_grh_enabled] RDMA 0x727634cde010: grh_enabled was set to 1
[12:06:41:894690][2843851][DOCA][INF][doca_ctx.cpp:185][adjust_ctx_ops_to_data_path] CTX 0x727634cde010: ctx ops set to GPU ops
[12:06:41:894693][2843851][DOCA][INF][doca_ctx.cpp:245][doca_ctx_start] CTX 0x727634cde010 does not require PE
[12:06:41:894860][2843851][DOCA][INF][doca_uar.cpp:233][bridge_init] UAR 0x589f5c3eaf40 created: page=0x72763590c000, reg_addr=0x72763590c800, base_addr=0x72763590c000, id=259, alloc_type=NONCACHE_DEDICATED
[12:06:41:988945][2843851][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 727413e00000 GPU 727413e00000 CPU 0 type 0 size 1052672

[12:06:41:988987][2843851][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x589f5c3bec40: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[12:06:41:989046][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[12:06:41:989055][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[12:06:41:989064][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[12:06:41:989068][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[12:06:41:989073][2843851][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x727634cde010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[12:06:41:989190][2843851][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 727413e00000 GPU 727413e00000 CPU 0 type 0 size 1052672

[12:06:41:989198][2843851][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x589f5c3bec40: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[12:06:41:989202][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[12:06:41:989204][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[12:06:41:989207][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[12:06:41:989210][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[12:06:41:989212][2843851][DOCA][ERR][rdma_ctx_gpu.cpp:154][priv_doca_rdma_gpu_create_cq] RDMA 0x727634cde010: Failed to create UMEM GPU. err=DOCA_ERROR_DRIVER
[12:06:41:989295][2843851][DOCA][DBG][doca_gpunetio.cpp:347][doca_gpu_mem_alloc] New memory: Orig 727413e00000 GPU 727413e00000 CPU 0 type 0 size 135168

[12:06:41:989300][2843851][DOCA][ERR][linux_devx_adapter.cpp:250][umem_reg] devx adapter 0x589f5c3bec40: Failed to register user memory. Got errno=UNKNOWN-errno14 (14)
[12:06:41:989303][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] Failed to register umem with exception:
[12:06:41:989306][2843851][DOCA][ERR][doca_umem.cpp:345][bridge_init] DOCA exception [DOCA_ERROR_DRIVER] with message Failed to register user memory
[12:06:41:989308][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] Failed to create umem with page size with exception:
[12:06:41:989311][2843851][DOCA][ERR][doca_umem.cpp:92][priv_doca_umem_create_pg_sz] DOCA exception [DOCA_ERROR_DRIVER] with message priv_doca_umem constructor failed
[12:06:41:989313][2843851][DOCA][ERR][rdma_ctx_gpu.cpp:662][priv_doca_rdma_gpu_create_rmp] Failed to create UMEM for RMP. err=DOCA_ERROR_DRIVER
[12:06:41:989358][2843851][DOCA][ERR][rdma_ctx_gpu.cpp:877][priv_doca_rdma_gpu_create_connection_objects] RDMA 0x727634cde010: Failed to create RMP. err=DOCA_ERROR_DRIVER
[12:06:41:989365][2843851][DOCA][ERR][rdma_ctx_gpu.cpp:1150][priv_doca_rdma_ctx_gpu_start] RDMA 0x727634cde010: Failed to start ctx: Failed to create connection objects. err=DOCA_ERROR_DRIVER
[12:06:41:989415][2843851][DOCA][ERR][doca_ctx.cpp:269][doca_ctx_start] Failed to start context 0x7fff310c5568 with status DOCA_ERROR_DRIVER
[12:06:41:989421][2843851][DOCA][ERR][rdma_common.c:428][create_rdma_resources] Failed to start RDMA context: DOCA Driver call failure
[12:06:41:989426][2843851][DOCA][INF][doca_pe.cpp:115][priv_doca_pe_ctx_destroy] Destroying progress engine ctx=0x727634cde010
[12:06:41:989430][2843851][DOCA][INF][doca_rdma.cpp:307][doca_rdma_destroy] RDMA 0x727634cde010: RDMA was destroyed
[12:06:41:990460][2843851][DOCA][INF][doca_dev.cpp:143][dev_put] Device 0x589f5c3b7cd0 was destroyed
[12:06:41:990467][2843851][DOCA][INF][doca_dev.cpp:1008][doca_dev_close] Local device 0x589f5c3b7cd0 was closed
[12:06:41:990469][2843851][DOCA][ERR][gpunetio_rdma_client_server_write_sample.c:571][rdma_write_server] Failed to allocate RDMA resources: DOCA Driver call failure
[12:06:41:990474][2843851][DOCA][ERR][gpunetio_rdma_client_server_write_main.c:495][main] rdma_write_server() failed: DOCA Driver call failure
[12:06:41:990484][2843851][DOCA][INF][gpunetio_rdma_client_server_write_main.c:514][main] Sample finished with errors
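
A note on the `UNKNOWN-errno14` in the log above: the logger apparently has no name for the value, but on Linux errno 14 is EFAULT ("Bad address"), which generally means the kernel rejected an address passed to it during registration. Here is a minimal sketch for decoding such values with the Python standard library. The helper name `describe_errno` is made up for illustration and is not part of DOCA.

```python
import errno
import os


def describe_errno(code: int) -> str:
    """Return 'NAME (message)' for a numeric errno value."""
    # errno.errorcode maps numeric values to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's human-readable message.
    return f"{name} ({os.strerror(code)})"


# Decode the value reported by the devx adapter in the log.
print(describe_errno(14))
```

This only names the error. It does not say which address was rejected, so the dmesg output is still needed to narrow things down.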

Below are my system specs:

Fri Aug  1 12:08:45 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 575.64.03              Driver Version: 575.64.03      CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA H200 NVL                On  |   00000000:E3:00.0 Off |                    0 |
| N/A   38C    P0             76W /  600W |      14MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA H200 NVL                On  |   00000000:E4:00.0 Off |                    0 |
| N/A   39C    P0             71W /  600W |      14MiB / 143771MiB |      0%      Default |
|                                         |                        |             Disabled |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            4630      G   /usr/lib/xorg/Xorg                        4MiB |
|    1   N/A  N/A            4630      G   /usr/lib/xorg/Xorg                        4MiB |
+-----------------------------------------------------------------------------------------+


BOOT_IMAGE=/vmlinuz-6.15.0 root=/dev/mapper/ubuntu--vg-ubuntu--lv ro amd_iommu=off hugepagesz=1G hugepages=128 quiet splash vt.handoff=7


nvidia-smi -q | grep -i bar -A 3
    BAR1 Memory Usage
        Total                             : 262144 MiB
        Used                              : 2 MiB
        Free                              : 262142 MiB
--
    BAR1 Memory Usage
        Total                             : 262144 MiB
        Used                              : 2 MiB
        Free                              : 262142 MiB

85:00.0 Ethernet controller: Mellanox Technologies MT2910 Family [ConnectX-7]

Both GPUs and the NIC are on the same NUMA node.


Do you have nvidia-peermem running?

No. Since it's a 6.15 kernel, I assume dmabuf will be used.

That is correct. Then you can do two things:

  1. After running the app and getting the error, check the output of sudo dmesg for any issue on your system
  2. Load nvidia-peermem anyway and see whether the app works with it
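
For the second option, it helps to confirm whether the module is actually loaded before and after trying. A minimal sketch, assuming the module shows up in /proc/modules as `nvidia_peermem` (the kernel lists module names with underscores); the helper name `module_loaded` is hypothetical:

```python
from pathlib import Path


def module_loaded(name: str, proc_modules: str) -> bool:
    """Check whether a kernel module appears in /proc/modules content."""
    # /proc/modules uses underscores even if the module is named with dashes.
    wanted = name.replace("-", "_")
    for line in proc_modules.splitlines():
        fields = line.split()
        # The first field of each line is the module name.
        if fields and fields[0] == wanted:
            return True
    return False


if __name__ == "__main__":
    path = Path("/proc/modules")
    # Fall back to an empty listing on systems without /proc/modules.
    text = path.read_text() if path.exists() else ""
    print("nvidia-peermem loaded:", module_loaded("nvidia-peermem", text))
```

If it reports False, the module can be loaded with modprobe and the sample rerun to compare the two paths.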

@eagostini I will check and update. One question: did NVIDIA make specific changes to the Mellanox driver to support GDAKI, or will the upstream driver work?
With the DOCA profiles/packaging, it's a little difficult to experiment on an upstream kernel (6.15 in my case).

Also, is there a Docker image for experimenting with the DOCA samples?

@eagostini One more question, about the library dependencies for the build.
I see the GPUNetIO example has a dependency on DPDK:

# Required DOCA Driver
sample_dependencies += dependency('libdpdk')

What is the purpose of using DPDK here?