fix bug that using invalid token to retry fetching layer #9274
Conversation
Hi @yyb196. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. Once the patch is verified, the new status will be reflected. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
21ca509 to 72ded46 (force-push)
This seems to be related to #8735.

@ktock Maybe we encountered the same issue but provided two different solutions. The solution in #8735 depends on the registry returning the token's expiry time. We are better off fixing the retry policy in containerd, since it is more compatible with existing registries.
@ktock @yyb196 I think we should probably take both solutions. The other one prevents using tokens that we know are expired, and this one does better handling of rejected tokens. For this one, I wonder if it is worth a debug log to explain why the token was rejected, using the error message (limiting the length of the message for malformed error messages).
remotes/docker/authorizer.go (outdated)

```diff
 for _, c := range auth.ParseAuthHeader(last.Header) {
 	if c.Scheme == auth.BearerAuth {
-		if err := invalidAuthorization(c, responses); err != nil {
+		if retry, err := invalidAuthorization(c, responses); err != nil {
```
Will one retry with no delay between be sufficient to get over the issue?
I don't think so. As the method name invalidAuthorization hints, the registry server has found the auth invalid; retrying with the same token is useless.
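In other words, a retry can only succeed after the rejected token is evicted so the next attempt authenticates afresh. A minimal sketch of that idea, assuming a hypothetical host-keyed `tokenCache` (this is not containerd's actual type, just an illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// tokenCache is a hypothetical sketch of why replaying a rejected token
// cannot work: once the registry answers 401 with an invalid-token
// error, the cached entry must be dropped so the retry fetches a fresh
// token instead of resending the stale one.
type tokenCache struct {
	mu     sync.Mutex
	tokens map[string]string // host -> bearer token
}

func (c *tokenCache) get(host string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	t, ok := c.tokens[host]
	return t, ok
}

// invalidate evicts the rejected token so a retry re-authenticates.
func (c *tokenCache) invalidate(host string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.tokens, host)
}

func main() {
	c := &tokenCache{tokens: map[string]string{"registry.example.com": "stale-token"}}
	// Registry rejected the token: evict it before retrying.
	c.invalidate("registry.example.com")
	_, ok := c.get("registry.example.com")
	fmt.Println("token still cached:", ok)
}
```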
@mmerkes Have I answered your question? This PR still needs your approval.
54afc07 to 4adce5c (force-push)
@dmcgowan As you suggested, I have added a debug log with a length-limited error message received from the registry server.
dmcgowan left a comment:
Looks good, just a couple nits
remotes/docker/authorizer.go (outdated)

```go
if n == 1 || (n > 1 && !sameRequest(responses[n-2].Request, responses[n-1].Request)) {
	return nil
limitedErr := errStr
if len(limitedErr) > 32 {
```
64 would be better to fit a full error message.
remotes/docker/authorizer.go (outdated)

```go
if len(limitedErr) > 32 {
	limitedErr = limitedErr[:32] + "..."
}
log.G(ctx).WithField("error", limitedErr).Debug("need retry authorization")
```
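The truncation in the hunk above can be sketched as a standalone helper. `truncateErr` is a hypothetical name, and `fmt` stands in for containerd's `log.G(ctx)` logger so the sketch is self-contained; the 32-byte cap mirrors the hunk, though the review suggests 64:

```go
package main

import "fmt"

// truncateErr caps the registry's error message so a malformed or huge
// response cannot flood the debug log.
func truncateErr(msg string, limit int) string {
	if len(msg) > limit {
		return msg[:limit] + "..."
	}
	return msg
}

func main() {
	long := "unauthorized: access token has expired, please re-authenticate against the registry"
	// containerd would emit this via log.G(ctx).WithField("error", ...).Debug(...)
	fmt.Println("authorization error using bearer token, retrying:", truncateErr(long, 32))
}
```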
"authorization error using bearer token, retrying"
Modified, PTAL.
Signed-off-by: frankyang <[email protected]>
4adce5c to 09e4051 (force-push)
```diff
 }

-func invalidAuthorization(c auth.Challenge, responses []*http.Response) error {
+func invalidAuthorization(ctx context.Context, c auth.Challenge, responses []*http.Response) (retry bool, _ error) {
```
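The new signature's named `retry` result documents what the first return value means, which is the point debated below. A simplified sketch of that shape (the branching here is illustrative only, not the PR's actual logic, and `invalidAuthorizationSketch` is a made-up name):

```go
package main

import (
	"errors"
	"fmt"
)

// invalidAuthorizationSketch illustrates the (retry bool, error) return
// shape: retry signals that the cached token should be dropped and the
// request attempted once more with a fresh token.
func invalidAuthorizationSketch(errStr string, repeatedRequest bool) (retry bool, _ error) {
	if !repeatedRequest {
		// First rejection of this request: drop the token and retry once.
		return true, nil
	}
	// The same request was rejected again: a fresh token was refused too,
	// so retrying further is pointless.
	return false, errors.New("failed to authorize: " + errStr)
}

func main() {
	retry, err := invalidAuthorizationSketch("invalid_token", false)
	fmt.Println(retry, err)
}
```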
nit: retry seems to be always false, so maybe it's unneeded? (EDIT: the true case doesn't seem to use retry)
It is to hint at what this return value means.

Is this PR already cherry-picked into 1.6.x and 1.7.x?
Update fork-external main with upstream main @ 452ec25 Related work items: containerd#5890, containerd#7647, containerd#9218, containerd#9233, containerd#9258, containerd#9270, containerd#9274, containerd#9279, containerd#9283, containerd#9286, containerd#9289, containerd#9290, containerd#9294, containerd#9295, containerd#9297, containerd#9305, containerd#9306, containerd#9308, containerd#9316, containerd#9317, containerd#9319, containerd#9320, containerd#9321
I ran into a problem where pulling an image reported an error, and when kubelet retried the pull the next time, it just succeeded.
After looking into the details, I found that the image is a GPU image,
and we limit the concurrent pull number to 4, while the token fetched from
the registry times out in 10 minutes. By the time the later layers got their
turn to download their blobs, the token had become invalid.
Containerd has a policy to retry pulling when a 401 is returned from
the registry, but when an invalid authorization error was returned, the cached
token wasn't removed, so the next retry just didn't work.
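The timing of that failure mode can be sketched numerically. All constants below are assumptions for illustration (4-way concurrency and a 10-minute token TTL come from the description above; the per-layer download time and layer count are invented):

```go
package main

import "fmt"

const (
	tokenTTLMin = 10 // registry token validity window, in minutes (from the description)
	concurrency = 4  // concurrent layer downloads (from the description)
	perLayerMin = 3  // assumed minutes to download one large GPU layer
	layers      = 20 // assumed number of layers in the image
)

// startMinute returns when layer i begins downloading, with layers
// drained in waves of `concurrency` at a time.
func startMinute(i int) int {
	return (i / concurrency) * perLayerMin
}

func main() {
	for i := 0; i < layers; i++ {
		if startMinute(i) >= tokenTTLMin {
			fmt.Printf("layer %d starts at minute %d: token already expired, must re-fetch\n", i, startMinute(i))
		}
	}
}
```

Under these assumptions the last wave of layers only starts after the token's validity window has closed, which is why evicting the rejected token and fetching a fresh one on retry fixes the pull.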