Concurrency issues with AFSocketServerConnector #142

@kevink-sq

Describe the bug
We found that our production Jetty servers stop processing requests after traffic spikes because of a ConcurrentModificationException involving junixsocket (stack trace below). This may be a regression introduced by the key-removal optimization in a7c3106.

java.util.ConcurrentModificationException
        at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
        at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.updateKeys(ManagedSelector.java:709)
        at org.eclipse.jetty.io.ManagedSelector$SelectorProducer.produce(ManagedSelector.java:539)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.produceTask(AdaptiveExecutionStrategy.java:455)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:248)
        at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
        at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)

A PR has been raised to propose a fix: #141
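
For readers unfamiliar with this failure mode, here is a minimal, single-threaded sketch of how a HashMap-backed key set fails fast when it is modified during iteration, which is the same check that fires in the trace above. The class and method names are hypothetical, the string keys merely stand in for selection keys, and the concurrent set shown for contrast is only one possible mitigation, not necessarily the approach taken in #141.

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class SelectedKeysDemo {
    // Hypothetical stand-in for a selector's selected-key set; not junixsocket code.
    static Set<String> fill(Set<String> keys) {
        keys.add("key-1");
        keys.add("key-2");
        keys.add("key-3");
        return keys;
    }

    // Returns true if iterating the set while removing from it raised
    // a ConcurrentModificationException.
    static boolean throwsWhenMutatedDuringIteration(Set<String> keys) {
        try {
            for (String key : keys) {
                // Simulates another code path cancelling a key mid-iteration.
                keys.remove(key);
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // HashSet iterators are fail-fast: a structural change invalidates them.
        System.out.println("HashSet throws: "
                + throwsWhenMutatedDuringIteration(fill(new HashSet<>())));
        // Concurrent key sets have weakly consistent iterators that never throw.
        System.out.println("ConcurrentHashMap.newKeySet throws: "
                + throwsWhenMutatedDuringIteration(fill(ConcurrentHashMap.<String>newKeySet())));
    }
}
```

In the production case the modification presumably comes from a different thread, so the exception is timing-dependent rather than deterministic as it is here.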

To Reproduce
Refer to https://github.com/kevink-sq/jetty-concurrency-issue-poc

Notes
Other issues were also detected during isolated load tests in the PoC and were fixed by introducing an atomic check in AFSelectorKey#cancel. The following issue appeared when running vegeta load tests and forcibly killing vegeta while the load was running:

2023-10-25 01:39:50.857:DEBUG:oejs.AbstractConnector:etp827671026-30-acceptor-0@5ee135f3-AFSocketServerConnector@6cec2a14[org.newsclub.net.unix.AFUNIXSocketAddress[path=junix-ingress.sock]]: Accept Closed by Interrupt
java.nio.channels.ClosedByInterruptException
        at org.newsclub.net.unix.jetty.AFSocketServerConnector.accept(AFSocketServerConnector.java:352)
        at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:748)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by:
java.net.SocketException: Too many open files
        at org.newsclub.net.unix.NativeUnixSocket.accept(Native Method)
        at org.newsclub.net.unix.AFSocketImpl.accept0(AFSocketImpl.java:262)
        at org.newsclub.net.unix.AFServerSocket.accept1(AFServerSocket.java:307)
        at org.newsclub.net.unix.AFServerSocketChannel.accept(AFServerSocketChannel.java:108)
        at org.newsclub.net.unix.AFUNIXServerSocketChannel.accept(AFUNIXServerSocketChannel.java:44)
        at org.newsclub.net.unix.AFUNIXServerSocketChannel.accept(AFUNIXServerSocketChannel.java:27)
        at org.newsclub.net.unix.jetty.AFSocketServerConnector.accept(AFSocketServerConnector.java:340)
        at org.eclipse.jetty.server.AbstractConnector$Acceptor.run(AbstractConnector.java:748)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
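
To illustrate what an "atomic check" in a cancel method can look like, here is a rough sketch assuming a compare-and-set guard so that cleanup runs at most once even when several threads cancel the same key. `CancelOnceKey` and its counter are hypothetical names for illustration only; this is not the actual AFSelectorKey implementation or the change proposed in the PR.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class CancelOnceKey {
    private final AtomicBoolean cancelled = new AtomicBoolean(false);
    // Counts how often the (simulated) cleanup ran; for demonstration only.
    private final AtomicInteger cleanupRuns = new AtomicInteger();

    // Returns true only for the single caller that actually performed the cancel.
    public boolean cancel() {
        if (!cancelled.compareAndSet(false, true)) {
            return false; // another caller already cancelled this key
        }
        cleanupRuns.incrementAndGet(); // stand-in for deregistering the key
        return true;
    }

    public boolean isValid() {
        return !cancelled.get();
    }

    public int cleanupRuns() {
        return cleanupRuns.get();
    }

    public static void main(String[] args) throws InterruptedException {
        CancelOnceKey key = new CancelOnceKey();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(key::cancel);
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("cleanup runs: " + key.cleanupRuns());
    }
}
```

The design choice is that `compareAndSet` makes the check and the state change a single step, so two racing callers cannot both pass the "not yet cancelled" test.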

Labels

bug: The issue describes a bug in the code
