NullPointerException and ArrayIndexOutOfBoundsException with pekko-http #15681

@filipl

Describe the bug

We get intermittent NullPointerException and ArrayIndexOutOfBoundsException errors in processes that use Pekko HTTP.

The interesting part of the stack trace:

java.lang.ArrayIndexOutOfBoundsException: arraycopy: last source index 36 out of bounds for byte[29]
	at java.base/java.lang.System.arraycopy(Native Method)
	at java.base/java.lang.AbstractStringBuilder.getBytes(AbstractStringBuilder.java:1669)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:613)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:629)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:209)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:91)
	at java.base/java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:161)
	at java.base/java.lang.StringBuilder.<init>(StringBuilder.java:143)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteHolder.save(PekkoRouteHolder.java:69)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper.apply(PekkoRouteWrapper.java:29)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper.apply(PekkoRouteWrapper.java:15)
	at org.apache.pekko.http.scaladsl.server.RouteConcatenation$RouteWithConcatenation.$anonfun$$tilde$2(RouteConcatenation.scala:58)
	at org.apache.pekko.http.scaladsl.util.FastFuture$.strictTransform$1(FastFuture.scala:49)
	at org.apache.pekko.http.scaladsl.util.FastFuture$.$anonfun$transformWith$3(FastFuture.scala:59)
	at scala.concurrent.impl.Promise$Transformation.run(Promise.scala:517)
	at org.apache.pekko.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:73)
	at org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.$anonfun$run$1(BatchingExecutor.scala:110)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:94)
	at org.apache.pekko.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:110)
	at org.apache.pekko.dispatch.TaskInvocation.run(AbstractDispatcher.scala:59)
	at org.apache.pekko.dispatch.ForkJoinExecutorConfigurator$PekkoForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:61)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)


java.lang.NullPointerException: null
	at java.base/java.lang.System.arraycopy(Native Method)
	at java.base/java.lang.AbstractStringBuilder.getBytes(AbstractStringBuilder.java:1669)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:613)
	at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:629)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:209)
	at java.base/java.lang.StringBuilder.append(StringBuilder.java:91)
	at java.base/java.lang.AbstractStringBuilder.<init>(AbstractStringBuilder.java:161)
	at java.base/java.lang.StringBuilder.<init>(StringBuilder.java:143)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteHolder.save(PekkoRouteHolder.java:69)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper.apply(PekkoRouteWrapper.java:29)
	at io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper.apply(PekkoRouteWrapper.java:15)
	at org.apache.pekko.http.scaladsl.server.RouteConcatenation$RouteWithConcatenation.$anonfun$$tilde$2(RouteConcatenation.scala:58)
	at org.apache.pekko.http.scaladsl.util.FastFuture$.flatMap$extension(FastFuture.scala:49)
	at org.apache.pekko.http.scaladsl.server.RouteConcatenation$RouteWithConcatenation.$anonfun$$tilde$1(RouteConcatenation.scala:55)
	at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$mapRouteResult$2(BasicDirectives.scala:80)
	at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
	at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$mapRouteResultWith$2(BasicDirectives.scala:86)
	at org.apache.pekko.http.scaladsl.server.directives.BasicDirectives.$anonfun$textract$2(BasicDirectives.scala:173)
	at org.apache.pekko.http.scaladsl.server.directives.ExecutionDirectives.$anonfun$handleExceptions$2(ExecutionDirectives.scala:42)
	at org.apache.pekko.http.scaladsl.server.Route$.$anonfun$createAsyncHandler$1(Route.scala:127)
	at org.apache.pekko.stream.impl.fusing.MapAsyncUnordered$$anon$29.onPush(Ops.scala:1435)
	at org.apache.pekko.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:556)
	at org.apache.pekko.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:434)
	at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:662)
	at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:532)
	at org.apache.pekko.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:637)
	at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.org$apache$pekko$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:813)
	at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:831)
	at org.apache.pekko.actor.Actor.aroundReceive(Actor.scala:547)
	at org.apache.pekko.actor.Actor.aroundReceive$(Actor.scala:545)
	at org.apache.pekko.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:729)
	at org.apache.pekko.actor.ActorCell.receiveMessage(ActorCell.scala:590)
	at org.apache.pekko.actor.ActorCell.invoke(ActorCell.scala:557)
	at org.apache.pekko.dispatch.Mailbox.processMailbox(Mailbox.scala:272)
	at org.apache.pekko.dispatch.Mailbox.run(Mailbox.scala:233)
	at org.apache.pekko.dispatch.Mailbox.exec(Mailbox.scala:245)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)

Steps to reproduce

I have been unable to reproduce this in a controlled project; it happens very rarely for us in production. My only lead is a suspected race condition in io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper#apply between routeHolder.save() and routeHolder.restore(), which would corrupt the StringBuilder route in PekkoRouteHolder. This is mostly a guess on my part. Perhaps the class is not meant to be thread safe and is instead expected never to be called from different threads concurrently.

I have not been able to get it to throw NPE or out-of-bounds exceptions in a controlled project, but I have managed to get routeHolder.restore() to run on a different thread than routeHolder.save(), which I guess is bad since PekkoRouteHolder is not thread safe. Together with other concurrent calls to push or pushIfNotCompletelyMatched, this could lead to the StringBuilder corruption seen in the stack traces above.
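
To make the suspected failure mode concrete, here is a minimal sketch of the general hazard, not the agent's code: one thread copies a shared java.lang.StringBuilder with new StringBuilder(CharSequence), the same constructor path that appears in both stack traces, while another thread mutates it. The object StringBuilderRaceSketch and the method hammer are hypothetical names made up for this illustration. Whether any exception is actually observed depends on timing and the JVM, so a count of zero proves nothing.

```scala
import java.util.concurrent.atomic.AtomicInteger

object StringBuilderRaceSketch {
  // Hypothetical stand-in for the mutable route buffer; NOT the agent's PekkoRouteHolder.
  // Returns the number of runtime exceptions seen while copying and mutating concurrently.
  def hammer(iterations: Int): Int = {
    val shared = new java.lang.StringBuilder()
    val failures = new AtomicInteger(0)

    // Writer: appends path-like segments and periodically resets the buffer.
    val writer = new Thread(() => {
      var i = 0
      while (i < iterations) {
        try {
          shared.append("/segment")
          if (shared.length() > 64) shared.setLength(0)
        } catch {
          case _: RuntimeException => failures.incrementAndGet()
        }
        i += 1
      }
    })

    // Copier: snapshots the buffer without any synchronization.
    val copier = new Thread(() => {
      var i = 0
      while (i < iterations) {
        try {
          new java.lang.StringBuilder(shared)
        } catch {
          case _: RuntimeException => failures.incrementAndGet()
        }
        i += 1
      }
    })

    writer.start()
    copier.start()
    writer.join()
    copier.join()
    failures.get()
  }

  def main(args: Array[String]): Unit =
    println(s"exceptions observed: ${StringBuilderRaceSketch.hammer(1000000)}")
}
```

The copy reads the source's length and backing array at different moments, so an unsynchronized writer can leave the two out of step. That is one plausible way to get an arraycopy bounds error like the one in the first trace.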

This simple Pekko HTTP server tries to force the map in io.opentelemetry.javaagent.instrumentation.pekkohttp.v1_0.server.route.PekkoRouteWrapper#apply to run on a different thread than the one that called save:

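import org.apache.pekko.actor.typed.ActorSystem
import org.apache.pekko.actor.typed.scaladsl.Behaviors
import org.apache.pekko.http.scaladsl.Http
import org.apache.pekko.http.scaladsl.model.{ContentTypes, HttpEntity, HttpResponse, StatusCodes}
import org.apache.pekko.http.scaladsl.server.Directives._
import org.apache.pekko.http.scaladsl.server.Route

import scala.concurrent.{ExecutionContextExecutor, Future}

object PekkoHttpServer {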

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem[Nothing] = ActorSystem(Behaviors.empty, "my-system")
    implicit val executionContext: ExecutionContextExecutor = system.executionContext

    val wrappedResponse: Route = { ctx =>
      ctx
        .reject()
        .flatMap { result =>
          // tries to force context switch
          Future { result }(ctx.executionContext)
        }(ctx.executionContext)
    }

    val route =
      wrappedResponse ~ path("hello") {
        complete(HttpResponse(StatusCodes.OK, entity = HttpEntity(ContentTypes.`text/plain(UTF-8)`, "Hello World")))
      }

    Http().newServerAt("localhost", 8080).bind(route)
  }
}

Running the agent with this patch somewhat consistently prints "UH OH" when save and restore run on different threads. Again, this might be a red herring, but I can't think of any other reason for the StringBuilder to be corrupted in this way.

uh-oh-logger.patch
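
For readers without the attachment, a check of this kind could look roughly like the sketch below. It is an assumption about the general idea, not the contents of uh-oh-logger.patch: TrackedHolder, its save/restore methods, and restoreOnOtherThread are hypothetical names, and the real PekkoRouteHolder API may differ.

```scala
object ThreadMismatchSketch {
  // Hypothetical holder that remembers which thread called save(); NOT the agent's class.
  final class TrackedHolder {
    @volatile private var savedOn: Option[Thread] = None

    def save(): Unit =
      savedOn = Some(Thread.currentThread())

    // Returns true (and logs) when restore runs on a different thread than save.
    def restore(): Boolean = {
      val current = Thread.currentThread()
      val mismatch = savedOn.exists(_ ne current)
      if (mismatch) {
        val saver = savedOn.map(_.getName).getOrElse("?")
        println(s"UH OH: saved on $saver, restored on ${current.getName}")
      }
      mismatch
    }
  }

  // Saves on the calling thread, then restores on a freshly started thread.
  def restoreOnOtherThread(): Boolean = {
    val holder = new TrackedHolder
    holder.save()
    var result = false
    val t = new Thread(() => { result = holder.restore() })
    t.start()
    t.join() // join() makes the write to result visible here
    result
  }

  // Saves and restores on the same thread.
  def restoreOnSameThread(): Boolean = {
    val holder = new TrackedHolder
    holder.save()
    holder.restore()
  }
}
```

A mismatch reported this way only shows that save and restore ran on different threads. It does not by itself prove that the buffer was accessed concurrently.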

Expected behavior

No NullPointerException or ArrayIndexOutOfBoundsException is thrown.

Actual behavior

Multiple threads call into code that modifies state and is probably not thread safe.

Javaagent or library instrumentation version

2.23.0

Environment

JDK: openjdk 21.0.9 2025-10-21
OS: Arch Linux

Additional context

No response


Labels: bug, needs triage
