Skip to content

Deadlock in grpc due to recursive grpc call #2515

@garrettjonesgoogle

Description

@garrettjonesgoogle

Please answer these questions before submitting your issue.

What version of gRPC are you using?

1.0.1

What JVM are you using (java -version)?

openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102)
OpenJDK 64-Bit Server VM (build 25.102-b01, mixed mode)

What did you do?

If possible, provide a recipe for reproducing the error.

  • Turned on FINE logging
  • Used the Logging client in google-cloud-java

What did you expect to see?

Not what I saw below...

What did you see instead?

Deadlock.

Found one Java-level deadlock:
=============================
"grpc-default-worker-ELG-1-1":
  waiting to lock monitor 0x00007fe628006318 (object 0x00000005cf4cead0, a java.lang.Object),
  which is held by "main"
"main":
  waiting to lock monitor 0x00007fe628002d78 (object 0x00000005ce107468, a java.lang.Object),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x00007fe628006318 (object 0x00000005cf4cead0, a java.lang.Object),
  which is held by "main"

Java stack information for the threads listed above:
===================================================
"grpc-default-worker-ELG-1-1":
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:121)
        - waiting to lock <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cf8472e8> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at io.netty.util.internal.logging.JdkLogger.log(JdkLogger.java:606)
        at io.netty.util.internal.logging.JdkLogger.debug(JdkLogger.java:186)
        at io.netty.util.internal.logging.AbstractInternalLogger.log(AbstractInternalLogger.java:147)
        at io.netty.handler.codec.http2.Http2FrameLogger.log(Http2FrameLogger.java:183)
        at io.netty.handler.codec.http2.Http2FrameLogger.logSettings(Http2FrameLogger.java:111)
        at io.netty.handler.codec.http2.Http2OutboundFrameLogger.writeSettings(Http2OutboundFrameLogger.java:81)
        at io.netty.handler.codec.http2.DefaultHttp2ConnectionEncoder.writeSettings(DefaultHttp2ConnectionEncoder.java:246)
        at io.netty.handler.codec.http2.DecoratingHttp2FrameWriter.writeSettings(DecoratingHttp2FrameWriter.java:70)
        at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.sendPreface(Http2ConnectionHandler.java:324)
        at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.channelActive(Http2ConnectionHandler.java:224)
        at io.netty.handler.codec.http2.Http2ConnectionHandler.channelActive(Http2ConnectionHandler.java:363)
        at io.grpc.netty.AbstractNettyHandler.channelActive(AbstractNettyHandler.java:78)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:202)
        at io.netty.handler.ssl.SslHandler.channelActive(SslHandler.java:1398)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:202)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1322)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:902)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:305)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:588)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:512)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:426)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:398)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        at java.lang.Thread.run(Thread.java:745)
"main":
        at io.grpc.internal.InUseStateAggregator.updateObjectInUse(InUseStateAggregator.java:51)
        - waiting to lock <0x00000005ce107468> (a java.lang.Object)
        at io.grpc.internal.TransportSet$BaseTransportListener.transportInUse(TransportSet.java:373)
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:129)
        - locked <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cd8b7740> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at java.util.logging.Logger.doLog(Logger.java:765)
        at java.util.logging.Logger.log(Logger.java:788)
        at java.util.logging.Logger.info(Logger.java:1490)
        at com.google.cloud.runtimes.jetty9.StackDriverLogging.init(StackDriverLogging.java:37)
        at com.google.cloud.runtimes.jetty9.StackDriverLogging.main(StackDriverLogging.java:64)
"pool-1-thread-1":
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:121)
        - waiting to lock <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cfbce9e0> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at java.util.logging.Logger.doLog(Logger.java:765)
        at java.util.logging.Logger.log(Logger.java:851)
        at io.grpc.internal.TransportSet.startNewTransport(TransportSet.java:217)
        at io.grpc.internal.TransportSet.obtainActiveTransport(TransportSet.java:192)
        - locked <0x00000005ce107468> (a java.lang.Object)
        at io.grpc.internal.ManagedChannelImpl$3.getTransport(ManagedChannelImpl.java:637)
        at io.grpc.internal.ManagedChannelImpl$3.getTransport(ManagedChannelImpl.java:579)
        at io.grpc.DummyLoadBalancerFactory$DummyLoadBalancer$1.get(DummyLoadBalancerFactory.java:135)
        at io.grpc.internal.DelayedClientTransport$2.run(DelayedClientTransport.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:295)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Background: I have been trying various strategies to resolve googleapis/google-cloud-java#1386 , where using the Logging service at level FINE results in grpc logging to the Logging service in a recursive way. I tried using a ThreadLocal to prevent this, but this doesn't work with grpc because the actual call is executed on a worker thread. Essentially I think I need some way to bail out of the LoggingHandler.publish call if I can detect that this is in the scope of a grpc worker thread sending a request to the Logging service.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions