Question/Bug: Unexpected -EAGAIN returned in CQE res field #766

@kelleymh

Consider the simple case of submitting a readv operation via io_uring_enter(), with no polling or other special flags. If the readv op is unable to immediately obtain a blk-mq tag in blk_mq_submit_bio(), it doesn’t wait. Control returns to io_queue_sqe(), which calls io_queue_async() to retry the operation. io_uring_enter() then completes normally. An iou-wrk-XXXX thread resubmits the I/O and waits for a tag as necessary. Eventually the readv completes normally and the CQE res field indicates success.

Now consider the more complex case where blk_mq_submit_bio() splits the bio because the transfer size exceeds the limits of the target device. If every bio making up the original request fails to get a tag immediately, the same retry procedure is used. An iou-wrk-XXXX thread resubmits the I/O and waits for tags as necessary, and eventually the readv completes normally.

But if some of the split bios get a tag immediately while others do not, the retry doesn’t happen. The bios that get a tag complete their chunks of the original I/O, but the overall readv returns -EAGAIN in the CQE res field because of the bios that couldn’t immediately get a tag.

Is this expected behavior? It seems desirable for the io_uring code to retry the readv in this case, as it does in the other cases, rather than requiring user space to do the retry. But I’m just learning about io_uring and how it interacts with the blk-mq layer, and I don’t see a straightforward way to fix this.
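For illustration, the user-space retry that this behavior seems to require could look roughly like the sketch below. It is only a sketch under assumptions: `readv_with_retry`, `submit_readv_fn`, and the `fake_submit` stand-in are hypothetical names, not part of io_uring or liburing. In a real program the callback would queue the readv SQE, wait for its CQE, and return the CQE res value.

```c
#include <errno.h>

/* Hypothetical callback type: submits one readv and returns the CQE res
 * value (bytes read on success, or a negative errno). A real version
 * would wrap the io_uring submit and wait-for-completion steps. */
typedef int (*submit_readv_fn)(void *ctx);

/* Resubmit the whole readv while the completion reports -EAGAIN,
 * giving up after max_attempts submissions. Returns the last res. */
static int readv_with_retry(submit_readv_fn submit, void *ctx, int max_attempts)
{
    int res = -EAGAIN;

    for (int attempt = 0; attempt < max_attempts && res == -EAGAIN; attempt++)
        res = submit(ctx);

    return res;
}

/* Stand-in for the kernel, for demonstration only: reports -EAGAIN for
 * the first fail_first submissions, then reports 4096 bytes read. */
struct fake_state {
    int calls;
    int fail_first;
};

static int fake_submit(void *ctx)
{
    struct fake_state *s = ctx;

    s->calls++;
    return s->calls <= s->fail_first ? -EAGAIN : 4096;
}
```

With `struct fake_state s = { 0, 2 };`, calling `readv_with_retry(fake_submit, &s, 5)` returns 4096 on the third submission. Note that this resubmits the entire readv, so the chunks that already completed are read again, which is part of why a retry inside io_uring seems preferable.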

Thoughts?
