IoUring: Reduce unnecessary io_uring_enter syscalls on non-blocking path #16259
|
@dreamlike-ocean let me know once you have run some benchmarks |
|
While benchmarking, it would be great to collect statistics for the different cases, e.g. total in-memory hits/sec and io_uring_enter syscalls caused by pending CQEs or by SQEs to submit, and possibly flame graphs, to make sure we have enough concurrent, small I/O |
|
Any news, @dreamlike-ocean? |
Sorry, I’m currently on vacation. I’ll move this forward once I’m back. |
|
@dreamlike-ocean enjoy your time off! |
|
Benchmark Results
Test Setup
strace statistics for
Conclusion: No performance regression under high concurrency, but the optimization shows minimal effect. This is expected — under saturated IO,
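To isolate and validate the effect of
worker.forEach(Main::executorInfiniteTask);

private static void executorInfiniteTask(EventExecutor executor) {
    // The task re-submits itself, so the executor always has pending work
    // and its event loop stays on the non-blocking path.
    executor.execute(() -> executor.execute(() -> executorInfiniteTask(executor)));
}
strace statistics for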
The remaining 65 calls in PATCH come from the acceptor thread (which does not have the infinite task) performing normal blocking waits. The OLD version's low 3.4ms per call confirms these are empty submits (
Conclusion: On the non-blocking path with no IO activity,
raw benchmark result
|
|
Looks great 😃 I will take a deeper look today |
|
@dreamlike-ocean thanks a lot! |
…ath (#16259)
Motivation:
In `IoUringIoHandler.run()`, the non-blocking path unconditionally calls `io_uring_enter` via `submitAndClearNow()` even when there are no pending SQEs and no deferred task work to flush.
Modification:
- Enable `IORING_SETUP_TASKRUN_FLAG` when `IORING_SETUP_DEFER_TASKRUN` is set, so the kernel signals `IORING_SQ_TASKRUN` when deferred completions are pending.
Result:
Fixes #16247.
No regression under high concurrency load. Under IO-idle non-blocking path scenario, `io_uring_enter` calls reduced from 486 to 65 (-86.6%).
---------
Co-authored-by: Norman Maurer <[email protected]>
(cherry picked from commit bedd0ac)
|
Auto-port PR for 5.0: #16397 |
… non-blocking path (#16397)
Auto-port of #16259 to 5.0
Cherry-picked commit: bedd0ac
---
Motivation:
In `IoUringIoHandler.run()`, the non-blocking path unconditionally calls `io_uring_enter` via `submitAndClearNow()` even when there are no pending SQEs and no deferred task work to flush.
Modification:
- Enable `IORING_SETUP_TASKRUN_FLAG` when `IORING_SETUP_DEFER_TASKRUN` is set, so the kernel signals `IORING_SQ_TASKRUN` when deferred completions are pending.
Result:
Fixes #16247.
No regression under high concurrency load. Under IO-idle non-blocking path scenario, `io_uring_enter` calls reduced from 486 to 65 (-86.6%).
Co-authored-by: Mengyang Li <[email protected]>
Co-authored-by: Norman Maurer <[email protected]>
Motivation:
In `IoUringIoHandler.run()`, the non-blocking path unconditionally calls `io_uring_enter` via `submitAndClearNow()` even when there are no pending SQEs and no deferred task work to flush.
Modification:
- Enable `IORING_SETUP_TASKRUN_FLAG` when `IORING_SETUP_DEFER_TASKRUN` is set, so the kernel signals `IORING_SQ_TASKRUN` when deferred completions are pending.
Result:
Fixes #16247.
No regression under high concurrency load. Under IO-idle non-blocking path scenario, `io_uring_enter` calls reduced from 486 to 65 (-86.6%).
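The check described in the Modification section can be illustrated with a minimal, self-contained sketch. This is not Netty's actual code: the class `EnterSkipSketch` and the methods `setupFlags`, `needsEnter` and `countEnters` are hypothetical names made up for illustration. Only the three flag values are taken from the Linux UAPI header `<linux/io_uring.h>`. The sketch assumes the event loop can cheaply read the number of pending SQEs and the SQ ring flags before deciding whether to call `io_uring_enter`.

```java
/** Hypothetical sketch of the "skip io_uring_enter when idle" idea; not Netty's implementation. */
final class EnterSkipSketch {
    // Flag values as defined in the Linux UAPI header <linux/io_uring.h>.
    static final int IORING_SETUP_TASKRUN_FLAG = 1 << 9;
    static final int IORING_SETUP_DEFER_TASKRUN = 1 << 13;
    static final int IORING_SQ_TASKRUN = 1 << 2;

    private EnterSkipSketch() { }

    /** Adds TASKRUN_FLAG whenever DEFER_TASKRUN was requested, so the kernel reports pending task work. */
    static int setupFlags(int requested) {
        if ((requested & IORING_SETUP_DEFER_TASKRUN) != 0) {
            return requested | IORING_SETUP_TASKRUN_FLAG;
        }
        return requested;
    }

    /** True if there are SQEs to submit or the kernel signalled deferred completions via IORING_SQ_TASKRUN. */
    static boolean needsEnter(int pendingSqes, int sqFlags) {
        return pendingSqes > 0 || (sqFlags & IORING_SQ_TASKRUN) != 0;
    }

    /**
     * Simulates an IO-idle event loop on the non-blocking path and counts
     * how many times it would call io_uring_enter (assumed model, no real syscalls).
     */
    static int countEnters(int iterations, boolean patched) {
        int enters = 0;
        for (int i = 0; i < iterations; i++) {
            int pendingSqes = 0; // idle: nothing queued for submission
            int sqFlags = 0;     // idle: kernel has no deferred task work
            if (!patched || needsEnter(pendingSqes, sqFlags)) {
                enters++;
            }
        }
        return enters;
    }
}
```

In this simplified model, `EnterSkipSketch.countEnters(1000, false)` counts one enter per loop iteration, while `EnterSkipSketch.countEnters(1000, true)` counts none. The real measurement above still shows 65 calls after the patch because, as noted in the benchmark comment, the acceptor thread keeps performing normal blocking waits.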