Auto-port 5.0: Fix race in io.netty.channel.uring.IoUringIoHandler.wakeup #16842

Merged: normanmaurer merged 1 commit into 5.0 from auto-port-pr-16836-to-5.0 on May 21, 2026

@netty-project-bot

Auto-port of #16836 to 5.0
Cherry-picked commit: b022b47


Motivation:

Fixes a shutdown race in the io_uring transport where wakeup() can write to an eventfd after it has already been closed.

The race looks like this:

Thread T1 (non-eventloop)            Eventloop thread
-------------------------            ----------------
wakeup()
  getAndSet(true) -> false
  flag = true
  [preempted before eventfd_write]

                                     prepareToDestroy()
                                       eventfd_write(1)
                                       submitAndGet()
                                       processCompletionsAndHandleOverflow(...)
                                     handleEventFdRead()
                                       if (!eventFdClosing) {
                                         flag = false
                                       }

                                     destroy()
                                       drainEventFd()
                                         flag.getAndSet(true) -> false
                                         concludes "no pending wakeup"

                                     completeRingClose()
                                       close(eventfd)

T1 resumes
  eventfd_write(eventfd)
  -> EBADF
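
The interleaving above can be forced deterministically in a small model. This is only a sketch, not Netty code: `UnguardedWakeupRace`, `FakeEventFd`, and the latches are hypothetical stand-ins. A `CountDownLatch` plays the role of the preemption, and an `IllegalStateException("EBADF")` stands in for the native error.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical model of the race; none of these names exist in Netty.
final class UnguardedWakeupRace {

    // Stand-in for the native eventfd: writing after close fails, like EBADF.
    static final class FakeEventFd {
        private volatile boolean closed;

        void write() {
            if (closed) {
                throw new IllegalStateException("EBADF");
            }
        }

        void close() {
            closed = true;
        }
    }

    // Replays the diagram and returns the error T1 observed, or null if the write succeeded.
    static String run() {
        FakeEventFd fd = new FakeEventFd();
        AtomicBoolean flag = new AtomicBoolean();          // models eventfdAsyncNotify
        CountDownLatch flagSet = new CountDownLatch(1);
        CountDownLatch fdClosed = new CountDownLatch(1);
        String[] failure = new String[1];

        Thread t1 = new Thread(() -> {
            if (!flag.getAndSet(true)) {                   // wakeup(): getAndSet(true) -> false
                flagSet.countDown();
                try {
                    fdClosed.await();                      // "preempted before eventfd_write"
                    fd.write();                            // T1 resumes
                } catch (IllegalStateException e) {
                    failure[0] = e.getMessage();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        t1.start();

        try {
            flagSet.await();
            flag.set(false);                               // handleEventFdRead(): flag = false
            boolean pending = flag.getAndSet(true);        // drainEventFd(): false, "no pending wakeup"
            fd.close();                                    // completeRingClose(): close(eventfd)
            fdClosed.countDown();
            t1.join();                                     // join makes failure[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return failure[0];
    }

    public static void main(String[] args) {
        System.out.println("T1 saw: " + run());            // prints "T1 saw: EBADF"
    }
}
```

The model shows only that the flag gives no protection once it has been cleared and re-set on the event loop side. It does not exercise the real ring or any native call.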

I also considered setting eventFdClosing earlier, but that changes a different part of the eventfd lifecycle. When eventFdClosing is set, handleEventFdRead() stops clearing eventfdAsyncNotify and stops submitting the next eventfd read:

handleEventFdRead()
  eventfdReadSubmitted = 0

  if (!eventFdClosing) {
      eventfdAsyncNotify = false
      submitEventFdRead()
  }

So if eventFdClosing is moved too early into prepareToDestroy(), shutdown can get stuck in this shape:

prepareToDestroy()
  eventFdClosing = true

eventfd read completion arrives
  handleEventFdRead()
    does not clear eventfdAsyncNotify
    does not submit another eventfd read

destroy()
  drainEventFd()
    sees eventfdAsyncNotify still true
    waits for an eventfd read completion that will not be submitted
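
The stuck shape can be illustrated with a single-threaded model of the three fields involved. This is a simplified sketch under assumptions: `EarlyClosingModel` and `drainWouldHang()` are hypothetical, `eventfdReadSubmitted` is reduced to a boolean, and the drain condition is inferred from the description above rather than taken from the real `drainEventFd()`.

```java
// Hypothetical single-threaded model; field names mirror the description, logic is simplified.
final class EarlyClosingModel {
    boolean eventFdClosing;
    boolean eventfdAsyncNotify;
    boolean eventfdReadSubmitted = true;   // assume one eventfd read is in flight

    // Mirrors the handleEventFdRead() pseudocode above.
    void handleEventFdRead() {
        eventfdReadSubmitted = false;
        if (!eventFdClosing) {
            eventfdAsyncNotify = false;
            eventfdReadSubmitted = true;   // stands in for submitEventFdRead()
        }
    }

    // Assumed drain rule: a pending notify means "wait for a read completion".
    // With no read submitted, that wait can never finish.
    boolean drainWouldHang() {
        boolean wasPending = eventfdAsyncNotify;
        eventfdAsyncNotify = true;
        return wasPending && !eventfdReadSubmitted;
    }

    // Runs one shutdown and reports whether the drain step would hang.
    static boolean shutdownHangs(boolean setClosingInPrepareToDestroy) {
        EarlyClosingModel m = new EarlyClosingModel();
        m.eventfdAsyncNotify = true;                       // a wakeup is pending
        m.eventFdClosing = setClosingInPrepareToDestroy;   // prepareToDestroy()
        m.handleEventFdRead();                             // eventfd read completion arrives
        return m.drainWouldHang();                         // destroy() -> drainEventFd()
    }

    public static void main(String[] args) {
        System.out.println("closing set early: hangs=" + shutdownHangs(true));   // true
        System.out.println("closing set late:  hangs=" + shutdownHangs(false));  // false
    }
}
```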

Modification:

Add a small wakeup write gate around eventfd_write():

wakeup thread                         event-loop thread
-------------                         -----------------
reserve wakeup writer slot

eventfd_write(eventfd)

release wakeup writer slot
                                      close gate
                                      wait until writer count is 0
                                      close(eventfd)

If the close gate is already closed, wakeup() returns without writing to the eventfd. At that point there is no event loop left to wake up, so dropping the wakeup is the correct behavior.
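
One way such a gate could look is sketched below. This is an illustration under assumptions, not the code from this change: `WakeupWriteGate` and its methods are hypothetical names, and packing the closed bit and the writer count into a single `AtomicInteger` is just one possible encoding.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical gate; the actual fields and methods in IoUringIoHandler may differ.
final class WakeupWriteGate {
    // Sign bit marks the gate as closed; the remaining bits count in-flight writers.
    private static final int CLOSED = Integer.MIN_VALUE;
    private final AtomicInteger state = new AtomicInteger();

    // Wakeup thread: reserve a writer slot unless the gate is already closed.
    boolean tryReserve() {
        for (;;) {
            int s = state.get();
            if ((s & CLOSED) != 0) {
                return false;
            }
            if (state.compareAndSet(s, s + 1)) {
                return true;
            }
        }
    }

    // Wakeup thread: release the slot after the write has returned.
    void release() {
        state.decrementAndGet();
    }

    // Event-loop thread: close the gate, then wait until the writer count is 0.
    // Only after this returns is it safe to close the eventfd.
    void closeAndAwaitWriters() {
        state.getAndUpdate(s -> s | CLOSED);
        while ((state.get() & ~CLOSED) != 0) {
            Thread.onSpinWait();
        }
    }

    // Wraps a write (a stand-in for eventfd_write) in reserve/release.
    // Returns false if the wakeup was dropped because the gate is closed.
    boolean writeIfOpen(Runnable eventfdWrite) {
        if (!tryReserve()) {
            return false;
        }
        try {
            eventfdWrite.run();
        } finally {
            release();
        }
        return true;
    }

    public static void main(String[] args) {
        WakeupWriteGate gate = new WakeupWriteGate();
        AtomicInteger writes = new AtomicInteger();
        System.out.println(gate.writeIfOpen(writes::incrementAndGet)); // true
        gate.closeAndAwaitWriters();
        System.out.println(gate.writeIfOpen(writes::incrementAndGet)); // false
        System.out.println(writes.get());                              // 1
    }
}
```

Because a writer can only reserve a slot while the closed bit is clear, any write that starts has finished by the time `closeAndAwaitWriters()` returns. The cost is one compare-and-set plus one decrement per wakeup.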

I intentionally did not add a larger eventfd state machine or more eventfd write states. The race does not require modeling the whole eventfd lifecycle.

This approach does add some extra atomic read/write overhead on the wakeup path. However, compared with the state-machine-based alternative, it keeps the fix easier to reason about.

I did not add a normal unit test for this race because the problematic window is between the eventfdAsyncNotify state transition and the native eventfd_write() call, and there is no good deterministic way to force that interleaving without adding test-only control points to IoUringIoHandler.

Result:

Fixes #16716.

I verified the fix with a dedicated reproducer that pauses wakeup() in the race window: https://github.com/dreamlike-ocean/netty/tree/repro-16716-eventfd-race

The reproducer fails before this change with eventfd_write(...) failed: Bad file descriptor and passes after applying the fix.

I may not fully understand all the details of this race yet, so please take a look and help verify whether this fix is reasonable.
@franz1981 @normanmaurer @tsegismont

@normanmaurer normanmaurer merged commit ac59a89 into 5.0 May 21, 2026
12 of 13 checks passed
@normanmaurer normanmaurer deleted the auto-port-pr-16836-to-5.0 branch May 21, 2026 18:56