-
Notifications
You must be signed in to change notification settings - Fork 42k
Set ExecSync timeout in liveness prober. #50176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set ExecSync timeout in liveness prober. #50176
Conversation
feiskyer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
pkg/kubelet/remote/remote_runtime.go
Outdated
| // Do not set timeout when timeout is 0. | ||
| ctx, cancel = getContextWithCancel() | ||
| // Do not set timeout when timeout is 0. | ||
| ctx, cancel := getContextWithCancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getContextWithCancel does a bunch of stuff. We should avoid calling it if it's not going to be used (i.e. move this into an else block)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, not much, it just returns context.WithCancel(context.Background())
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but context.WithCancel potentially brings up a go routine. It might be trivial, but it's better practice not to do unecessary work when it's easily avoidable..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
9147731 to
7e06e55
Compare
pkg/kubelet/remote/remote_runtime.go
Outdated
| var cancel context.CancelFunc | ||
| if timeout != 0 { | ||
| // Use timeout + default timeout (2 minutes) as timeout. | ||
| ctx, cancel = getContextWithTimeout(r.timeout + time.Duration(timeout)*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you multiple the user pass in timeout with time.Second?
Not sure this is the effect you want here, see https://play.golang.org/p/4hJFFXqoga
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we probably need a unit test for this if this is indeed an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It looks like timeout is already a Duration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied the code from StopContainer. Yeah, there is a mismatch. timeout in StopContainer is int64.
Will update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Random-Liu Is it possible to add a unit test for this?
7e06e55 to
7267576
Compare
pkg/kubelet/remote/remote_runtime.go
Outdated
| var cancel context.CancelFunc | ||
| if timeout != 0 { | ||
| // Use timeout + default timeout (2 minutes) as timeout. | ||
| ctx, cancel = getContextWithTimeout(r.timeout + timeout*time.Second) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you multiplying the timeout by time.Second? That should have been done by the caller. I'd expect it to just be passed through here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My fault.
pkg/kubelet/remote/remote_runtime.go
Outdated
| var ctx context.Context | ||
| var cancel context.CancelFunc | ||
| if timeout != 0 { | ||
| // Use timeout + default timeout (2 minutes) as timeout. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems a little odd to add the timeout with the default here? Can you explain the reasoning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with stop, we want to leave some time for the runtime to do cleanup.
Runtime can't finish running immediately when timeout exceed.
| // Use timeout + default timeout (2 minutes) as timeout. | ||
| ctx, cancel = getContextWithTimeout(r.timeout + timeout*time.Second) | ||
| } else { | ||
| ctx, cancel = getContextWithCancel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use the default timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We should not set any timeout by default because exec may take long time.
7267576 to
ef29b83
Compare
|
@yujuhong Sorry, I forgot to push my local fix. Pushed newest fix. |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Random-Liu, yujuhong Associated issue: 50389 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
|
@yujuhong Thanks for reviewing! |
|
Automatic merge from submit-queue (batch tested with PRs 50536, 50809, 50220, 50399, 50176) |
Although Dockershim doesn't actually support
ExecSynctimeout (see here), we should set the timeout, so that the other runtime which supports the timeout could work properly.Fixes #50389.
/cc @yujuhong @timstclair @feiskyer