linux: cap number of in-flight io_uring SQEs#4601

Closed
bnoordhuis wants to merge 1 commit into libuv:v1.x from bnoordhuis:fix4598

Conversation

@bnoordhuis
Member

Limit the number of in-flight submission entries to the size of the completion queue. The kernel's sqpoll thread hoovers them up really fast and we may end up posting way more submissions than fit in the completion queue.

That's not fatal as such; the kernel stores them in an overflow queue that we flush eventually, but that queue can grow pretty big if the program submits lots of file operations.

Salient questions are:

  1. Is this change a bug fix? If yes, why? If not, why not?

  2. Is this change a (de)optimization? Is the kernel's overflow queue more or less efficient than libuv's thread pool?

This commit removes the IORING_SQ_CQ_OVERFLOW handling that was added in commit 30fc896 ("unix: handle CQ overflow in iou ring") from May 2023 because it is no longer needed.

Fixes: #4598

Comment on lines +784 to +785
if (iou->in_flight >= iou->max_in_flight)
return NULL;
Member

What about if, when we reach the condition, we enter the ring to retrieve whatever completions are already there, push those to a pending_queue to make sure no request is completed synchronously, and decrease iou->in_flight accordingly? Only if iou->in_flight reaches the limit with no completions left would we go through the threadpool.

Member Author

It's probably because it's a sleepy Saturday night but I'm not quite sure what you're asking. To call io_uring_enter when we hit the limit?

I'm guessing you maybe intended to write s/What about/How about/?

Member

@santigimeno santigimeno Nov 9, 2024

Yes, sorry about not being clear, or even correct. My thinking is: by the time the limit is hit, it's very likely completions are already available, so grab them and keep submitting to the ring, taking advantage of how fast those ops are sometimes performed. I said "enter" and that's not correct, as we're not actually using io_uring_enter() to grab the completions (I conflated it with the io_uring_enter(IORING_ENTER_GETEVENTS) we were using to make more completions available when the OVERFLOW state was reached).

Member Author

Right, I believe I'm following now. I don't think we can call uv__poll_io_uring() from here because that runs user callbacks.

Well, I guess we could if we retrieved the uv_fs_t pointer from the CQE and queued it somewhere, but... that's yet another queue :-)

Member

Well, I guess we could if we retrieved the uv_fs_t pointer from the CQE and queued it somewhere, but... that's yet another queue :-)

Yes, that was what I was thinking when I mentioned pushing completions to a pending_queue...
So, do you think it would make a difference? Yes, it's another queue, but we would be sending more requests to the ring than to the threadpool, which should have better performance?

Member Author

Well... maybe. The counterargument is that it's default-disabled code so it's not going to get used or tested much. An easier fix is to make the SQ and CQ bigger; they're quite modestly sized now. I.e.:

diff --git a/src/unix/linux.c b/src/unix/linux.c
index 279ca2c8..c63484d6 100644
--- a/src/unix/linux.c
+++ b/src/unix/linux.c
@@ -773,7 +773,7 @@ static struct uv__io_uring_sqe* uv__iou_get_sqe(struct uv__iou* iou,
       return NULL;
     }
 
-    uv__iou_init(loop->backend_fd, iou, 64, UV__IORING_SETUP_SQPOLL);
+    uv__iou_init(loop->backend_fd, iou, 1024, UV__IORING_SETUP_SQPOLL);
     if (iou->ringfd == -2)
       iou->ringfd = -1;  /* "failed" */
   }

Member

So I've benchmarked this last change, and performance drops significantly on my laptop. I'm running the fs_stat benchmark with these changes. Could you check that I'm not doing anything obviously wrong?

diff --git a/src/unix/linux.c b/src/unix/linux.c
index 857a4ef8a..293cc1ce3 100644
--- a/src/unix/linux.c
+++ b/src/unix/linux.c
@@ -773,7 +773,7 @@ static struct uv__io_uring_sqe* uv__iou_get_sqe(struct uv__iou* iou,
       return NULL;
     }
 
-    uv__iou_init(loop->backend_fd, iou, 64, UV__IORING_SETUP_SQPOLL);
+    uv__iou_init(loop->backend_fd, iou, 1024, UV__IORING_SETUP_SQPOLL);
     if (iou->ringfd == -2)
       iou->ringfd = -1;  /* "failed" */
   }
diff --git a/test/benchmark-fs-stat.c b/test/benchmark-fs-stat.c
index c41062241..c9d1da07d 100644
--- a/test/benchmark-fs-stat.c
+++ b/test/benchmark-fs-stat.c
@@ -27,7 +27,8 @@
 
 #define NUM_SYNC_REQS         (10 * 1e5)
 #define NUM_ASYNC_REQS        (1 * (int) 1e5)
-#define MAX_CONCURRENT_REQS   32
 +/* To be sure it overflows with a ring size of 64 */
+#define MAX_CONCURRENT_REQS   128
 
 #define sync_stat(req, path)                                                  \
   do {                                                                        \
@@ -129,6 +130,7 @@ static void async_bench(const char* path) {
  * out of the equation.
  */
 BENCHMARK_IMPL(fs_stat) {
+  uv_loop_configure(uv_default_loop(), UV_LOOP_USE_IO_URING_SQPOLL);
   const char path[] = ".";
   warmup(path);
   sync_bench(path);

Member Author

I ran fs_stat with 32 and 128 (in steps of 8) concurrent requests in the following configurations on x86_64 Linux 6.5.0-18-generic:

  1. no io_uring, i.e., thread pool
  2. io_uring the way it is in v1.x
  3. io_uring + this pull request
  4. io_uring + this pull request + 1024 sqes, 2048 cqes

For reqs=32, the results are uninteresting: the thread pool is around 50% slower, the other three are indistinguishable from each other.

For reqs=128, things get more interesting. At low concurrency rates (let's say < 32) results are much the same as for reqs=32 but at high concurrency rates (> 64):

  • (1) is still slower but closing the gap (15%)
  • (4) is 35% slower than (2) and (3) - I think that's what you're seeing too?

fs_stat stress-tests system call overhead, not disk I/O (it only hits the dentry cache), so it seems we're falling off some io_uring performance cliff in (4).

@santigimeno
Member

santigimeno commented Nov 9, 2024

  1. Is this change a bug fix? If yes, why? If not, why not?

I'm not sure it's a bug fix. It's probably about as easy for the kernel to run out of memory as it is for the application running libuv.

  2. Is this change a (de)optimization? Is the kernel's overflow queue more or less efficient than libuv's thread pool?

From the documentation, it seems to be a deoptimization. Whether it's better or worse than going through the threadpool... that's a different story. We may want to run some benchmarks.

@bnoordhuis
Member Author

@santigimeno Santi, what's the conclusion here? Merge, not merge? I'm guessing the latter?

@santigimeno
Member

@santigimeno Santi, what's the conclusion here? Merge, not merge? I'm guessing the latter?

I tend to agree. I was planning to make a PoC of the queue solution but I don't know when I'll have the time.

@bnoordhuis
Member Author

Okay, thanks - I'll go ahead and close this and #4598. Feel free to reopen if necessary.

@bnoordhuis bnoordhuis closed this Nov 19, 2024


Successfully merging this pull request may close these issues.

io_uring: possible issue handling CQ overflow

2 participants