core: delay sending cancel request on client-side when deadline expires#6328
core: delay sending cancel request on client-side when deadline expires#6328ran-su merged 19 commits intogrpc:masterfrom
Conversation
zhangkun83
left a comment
There was a problem hiding this comment.
We should also handle the case where the
8095621 to
f2cba8f
Compare
76c8bd8 to
1477e33
Compare
zhangkun83
left a comment
There was a problem hiding this comment.
LGTM.
@ejona86 it's ready for you to review now.
| stream.cancel(statusFromCancelled(context)); | ||
| } else { | ||
| if (deadlineCancellationSendToServerFuture == null) { | ||
| deadlineCancellationSendToServerFuture = deadlineCancellationExecutor.schedule( |
There was a problem hiding this comment.
I don't see anything here that triggers the application to be called immediately when the deadline is exceeded. Won't this delay all deadline notifications to the application by 1 second?
There was a problem hiding this comment.
All notifications to the application are sent in startDeadlineNotifyApplicationTimer(), which follows the initial deadline.
There was a problem hiding this comment.
In the case calloption's deadline equals to the context deadline, startDeadlineNotifyApplicationTimer() is not called when the ClientCall is started.
| deadlineCancellationNotifyApplicationFuture = | ||
| startDeadlineNotifyApplicationTimer(effectiveDeadline, observer); | ||
| } else { | ||
| if (cancellationListener instanceof ClientCallImpl.ContextCancellationListener) { |
There was a problem hiding this comment.
It's always the case. You can change cancellationListener's type to ContextCancellationListener to avoid the casting.
| startDeadlineNotifyApplicationTimer(effectiveDeadline, observer); | ||
| } else { | ||
| if (cancellationListener instanceof ClientCallImpl.ContextCancellationListener) { | ||
| ((ContextCancellationListener) cancellationListener).setObserver(observer); |
There was a problem hiding this comment.
You probably need to move line 320 (context.addListener()) to after this else-block, so that if the deadline is exceeded immediately, observer can still be notified.
| if (observer != null) { | ||
| Status status = statusFromCancelled(context); | ||
| deadlineCancellationSendToServerFuture = startDeadlineSendCancelToServerTimer(status); | ||
| executeCloseObserverInContext(observer, status); |
There was a problem hiding this comment.
These two lines are repeated in DeadlineExceededNotifyApplicationTimer. Since this is the core logic of delayed cancellation, it should be in its own method (delayedCancelOnDeadlineExceeded(status, observer)), and the two cases just call it.
| } | ||
|
|
||
| callExecutor.execute(new CloseInContext()); | ||
| private void executeCloseObserverInContext(final Listener<RespT> observer, final Status status) { |
There was a problem hiding this comment.
There are other two call sites of closeObserver() at the start of startInternal() that can be simplified by calling this method.
| TimeUnit.NANOSECONDS); | ||
| } | ||
|
|
||
| private Status buildDeadlineExceededStatusWithRemainingNanos(long remainingNanos) { |
There was a problem hiding this comment.
nit: this method is used only once, and the only caller is very short. It may be more readable as an inline codeblock at the call site.
| } | ||
| } | ||
|
|
||
| private ScheduledFuture<?> startDeadlineSendCancelToServerTimer(final Status status) { |
There was a problem hiding this comment.
nit: now that there is only one caller of this method, it would be more readable to merge it into its caller.
| // cancelling the stream, which changes the final status of the stream from the server's | ||
| // perspective. | ||
| // Mitigate this by introduce a delay for client to send the cancel request.(b/118879795) | ||
| // This only impacts server statistics. |
There was a problem hiding this comment.
The language here is a bit fragmented. How about
When a deadline is exceeded, there is a race between the server receiving the cancellation from the client and the server cancelling the stream itself. If the client's cancellation is received first, then the stream's status will be CANCELLED instead of DEADLINE_EXCEEDED. This prevents server monitoring from noticing high rate of DEADLINE_EXCEEDED, a common monitoring metric (b/118879795). Mitigate this by delayed sending of the client's cancellation.
| @Override | ||
| public void cancelled(Context context) { | ||
| stream.cancel(statusFromCancelled(context)); | ||
| public void cancelled(final Context context) { |
| } | ||
| } | ||
|
|
||
| @SuppressWarnings("unchecked") |
| if (context.getDeadline() == null || !context.getDeadline().isExpired()) { | ||
| stream.cancel(statusFromCancelled(context)); | ||
| } else { | ||
| if (observer != null) { |
There was a problem hiding this comment.
I'm not very comfortable with this doing nothing if observer is null. It looks like an optimization, but it requires multiple pieces to behave in a particular way. I think it would be better to add the optimization to delayedCancelOnDeadlineExceeded(): if deadlineCancellationSendToServerFuture is already non-null (the method has already been called), return early. Note there's a race there, but there's a race there already (and so I'll have another comment about it elsewhere).
Changing this would also mean we could pass the Listener in the constructor and get rid of setObserver().
ejona86
left a comment
There was a problem hiding this comment.
One important change, but should be easy.
| deadlineCancellationNotifyApplicationFuture = | ||
| startDeadlineNotifyApplicationTimer(effectiveDeadline, observer); | ||
| } | ||
| cancellationListener = new ContextCancellationListener(observer); |
There was a problem hiding this comment.
I don't think this is thread-safe, since this field is read in removeContextListenerAndCancelDeadlineFuture(). I think you need to just move this initialization up higher before stream.start().
When deadline expires, both the client and the server try to cancel the stream. There is a race between server receiving the cancellation from the client and the server cancelling the stream, which changes the final status of the stream from the server's perspective. If the former wins, server sees CANCELLED. If the latter wins, server sees DEADLINE_EXCEEDED.
Because
ServerCall.Listenerdoesn't pass the final status to the server-side application, this ambiguity isn't a problem in most cases. However, the status is passed toServerStreamTracer, and thus will cause confusion in stats.This PR delays the client-side cancellation in case of deadline exceeded by a small margin, thus makes DEADLINE_EXCEEDED be reported most of the time.
b/118879795