Add a null check for the endpoint on shutdown#15727
Conversation
|
|
|
markdroth
left a comment
There was a problem hiding this comment.
Good diagnosis of this problem! I think the fix is not quite right, though.
Is there some reasonable way to write a test to catch this?
src/core/lib/channel/handshaker.cc
Outdated
There was a problem hiding this comment.
I think we want to reset error on the next line even if the endpoint is null. Otherwise, we will invoke mgr->on_handshake_done with no error, and the caller will expect the endpoint to be non-null, so we'd just be moving the crash to a different place.
|
|
I don't see a way to force this failure case with any end-to-end tests because the core surface has no guarantees about when flush gets called. I think the only way to test this would be set up some kind of mock no-op handshaker and a unit test for the handshake manager. |
markdroth
left a comment
There was a problem hiding this comment.
Thanks for doing this, Ken!
I think the only way to test this would be set up some kind of mock no-op handshaker and a unit test for the handshake manager.
I agree. @yashykt, that's probably something to think about doing in the future, maybe as part of the C++-ification of the handshaker APIs.
src/core/lib/channel/handshaker.cc
Outdated
There was a problem hiding this comment.
Please add a comment explaining the race condition that we're catching here (the case you describe in the PR description).
|
|
|
|
|
|
|
|
We are seeing a null pointer exception internally on the call to
grpc_endpoint_shutdown()a few lines below internally.Hypothetical race condition:
1: A handshaker completes successfully
2:
on_handshake_doneis scheduled on theExecCtx(with no error)3:
grpc_handshake_manager_shutdown()is invoked on some other thread.4: The handshakers shutdown function shuts down the endpoint and sets it to null (see rpc-switch)
5:
ExecCtx::Flush()gets invoked on the original thread,on_handshake_doneis invoked withGRPC_ERROR_NONEandmgr->shutdown=true, and the endpoint has already been nulled out.