message from @Bill Broadley (continue)
I’ve not tried NFSv3 yet, does NFSv4 have a known problem with RDMA? The same pair of machines has NFSv4 working over GigE. I tried your command and:
dmesg -c
cat testfile
cat: testfile: Input/output error
rpcdebug -m rpc -s all
dmesg
[242031.688205] RPC: looking up Generic cred
[242031.688213] RPC: looking up Generic cred
[242031.688216] RPC: looking up Generic cred
[242031.688222] RPC: new task initialized, procpid 12386
[242031.688224] RPC: allocated task ffff880845c33000
[242031.688236] RPC: 110 __rpc_execute flags=0x81
[242031.688241] RPC: 110 call_start nfs4 proc OPEN (async)
[242031.688243] RPC: 110 call_reserve (status 0)
[242031.688246] RPC: 110 reserved req ffff880853255600 xid dfed959b
[242031.688249] RPC: 110 call_reserveresult (status 0)
[242031.688251] RPC: 110 call_refresh (status 0)
[242031.688253] RPC: 110 refreshing UNIX cred ffff881050ef20c0
[242031.688255] RPC: 110 call_refreshresult (status 0)
[242031.688257] RPC: 110 call_allocate (status 0)
[242031.688263] RPC: xprt_rdma_allocate: size 6344 too large for buffer[1024]: prog 100003 vers 4 proc 1
[242031.688269] RPC: xprt_rdma_allocate: size 6344, request 0xffff88082b81c000
[242031.688271] RPC: 110 call_bind (status 0)
[242031.688273] RPC: 110 call_connect xprt ffff880846752000 is connected
[242031.688275] RPC: 110 call_transmit (status 0)
[242031.688277] RPC: 110 xprt_prepare_transmit
[242031.688279] RPC: 110 xprt_cwnd_limited cong = 0 cwnd = 4096
[242031.688281] RPC: 110 rpc_xdr_encode (status 0)
[242031.688283] RPC: 110 marshaling UNIX cred ffff881050ef20c0
[242031.688286] RPC: 110 using AUTH_UNIX cred ffff881050ef20c0 to wrap rpc data
[242031.688290] RPC: 110 xprt_transmit(220)
[242031.688293] RPC: rpcrdma_inline_pullup: pad 0 destp 0xffff88082b81d83c len 220 hdrlen 220
[242031.688297] RPC: rpcrdma_register_frmr_external: Using frmr ffff880845ccda90 to map 1 segments
[242031.688301] RPC: rpcrdma_create_chunks: reply chunk elem 3124@0x82b81e3f4:0xe006730f (last)
[242031.688305] RPC: rpcrdma_marshal_req: reply chunk: hdrlen 48 rpclen 220 padlen 0 headerp 0xffff88082b81d100 base 0xffff88082b81d760 lkey 0x8000
[242031.688309] RPC: 110 xmit complete
[242031.688311] RPC: 110 sleep_on(queue “xprt_pending” time 4355352568)
[242031.688313] RPC: 110 added to queue ffff880846752258 “xprt_pending”
[242031.688315] RPC: 110 setting alarm for 60000 ms
[242031.688318] RPC: wake_up_first(ffff880846752190 “xprt_sending”)
[242031.688333] RPC: rpcrdma_event_process: event rep ffff880845ccda90 status 0 opcode 8 length 4294936584
[242031.688340] RPC: rpcrdma_event_process: event rep (null) status 0 opcode 0 length 4294936584
[242031.688473] RPC: rpcrdma_event_process: event rep ffff88085436a000 status 0 opcode 80 length 48
[242031.688479] RPC: rpcrdma_reply_handler: reply 0xffff88085436a000 completes request 0xffff88082b81c000
[242031.688479] RPC request 0xffff880853255600 xid 0x9b95eddf
[242031.688484] RPC: rpcrdma_count_chunks: chunk 308@0x82b81e3f4:0xe006730f
[242031.688486] RPC: rpcrdma_reply_handler: xprt_complete_rqst(0xffff880846752000, 0xffff880853255600, 308)
[242031.688489] RPC: 110 xid dfed959b complete (308 bytes received)
[242031.688492] RPC: 110 __rpc_wake_up_task (now 4355352568)
[242031.688493] RPC: 110 disabling timer
[242031.688496] RPC: 110 removed from queue ffff880846752258 “xprt_pending”
[242031.688500] RPC: __rpc_wake_up_task done
[242031.688544] RPC: 110 __rpc_execute flags=0x881
[242031.688547] RPC: 110 call_status (status 308)
[242031.688549] RPC: 110 call_decode (status 308)
[242031.688551] RPC: 110 validating UNIX cred ffff881050ef20c0
[242031.688554] RPC: 110 using AUTH_UNIX cred ffff881050ef20c0 to unwrap rpc data
[242031.688560] RPC: 110 call_decode result 0
[242031.688563] RPC: wake_up_first(ffff881052515e98 “NFSv4.0 transport Slot table”)
[242031.688566] RPC: 110 return 0, status 0
[242031.688568] RPC: 110 release task
[242031.688570] RPC: wake_up_first(ffff880846752190 “xprt_sending”)
[242031.688573] RPC: xprt_rdma_free: called on 0xffff88085436a000
[242031.688580] RPC: rpcrdma_event_process: event rep ffff880845ccda90 status 0 opcode 7 length 4294936584
[242031.688586] RPC: 110 release request ffff880853255600
[242031.688588] RPC: wake_up_first(ffff880846752320 “xprt_backlog”)
[242031.688590] RPC: rpc_release_client(ffff881052517200)
[242031.688604] RPC: 110 freeing task
[242031.688657] RPC: new task initialized, procpid 12386
[242031.688660] RPC: allocated task ffff88085145fb38
[242031.688672] RPC: 111 __rpc_execute flags=0x1
[242031.688678] RPC: 111 call_start nfs4 proc READ (async)
[242031.688681] RPC: 111 call_reserve (status 0)
[242031.688686] RPC: 111 reserved req ffff880853255600 xid e0ed959b
[242031.688689] RPC: 111 call_reserveresult (status 0)
[242031.688692] RPC: 111 call_refresh (status 0)
[242031.688695] RPC: 111 refreshing UNIX cred ffff881050ef20c0
[242031.688697] RPC: 111 call_refreshresult (status 0)
[242031.688699] RPC: 111 call_allocate (status 0)
[242031.688705] RPC: xprt_rdma_allocate: size 684, request 0xffff88082bea8000
[242031.688707] RPC: 111 call_bind (status 0)
[242031.688709] RPC: 111 call_connect xprt ffff880846752000 is connected
[242031.688711] RPC: 111 call_transmit (status 0)
[242031.688713] RPC: 111 xprt_prepare_transmit
[242031.688715] RPC: 111 xprt_cwnd_limited cong = 0 cwnd = 4096
[242031.688717] RPC: 111 rpc_xdr_encode (status 0)
[242031.688719] RPC: 111 marshaling UNIX cred ffff881050ef20c0
[242031.688721] RPC: 111 using AUTH_UNIX cred ffff881050ef20c0 to wrap rpc data
[242031.688724] RPC: 111 xprt_transmit(152)
[242031.688727] RPC: rpcrdma_inline_pullup: pad 0 destp 0xffff88082bea97f8 len 152 hdrlen 152
[242031.688731] RPC: rpcrdma_register_frmr_external: Using frmr ffff880845ccd040 to map 1 segments
[242031.688734] RPC: rpcrdma_create_chunks: write chunk elem 4096@0x8460d0000:0xe006310f (more)
[242031.688737] RPC: rpcrdma_register_frmr_external: Using frmr ffff880845ccda68 to map 1 segments
[242031.688739] RPC: rpcrdma_event_process: event rep ffff880845ccd040 status 0 opcode 8 length 4294936584
[242031.688744] RPC: rpcrdma_create_chunks: write chunk elem 152@0x82bea9974:0xe006720d (last)
[242031.688747] RPC: rpcrdma_marshal_req: write chunk: hdrlen 68 rpclen 152 padlen 0 headerp 0xffff88082bea9100 base 0xffff88082bea9760 lkey 0x8000
[242031.688749] RPC: rpcrdma_event_process: event rep ffff880845ccda68 status 0 opcode 8 length 4294936584