-
Notifications
You must be signed in to change notification settings - Fork 653
[WIP] Update grpc etc #2631
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Update grpc etc #2631
Conversation
|
|
Race detector appears to be triggered because GRPC previously did not call the |
|
ping @cyli ^^ feels like your area 🤗 |
|
The other failures are the same damned bug as #2452, which is that the returned error from GRPC changed again. |
|
Yeah, the problem is once again that we try to fail on an error that otherwise is transient, and GRPC no longer as of that version exposes the error we need. Is there anyone in Docker with more experience with GRPC that might know the approach to fix this? |
Ugh.. I saw the failures and actually thought it could be (but it was late and didn't look into that) |
|
As far as I can tell, there is no longer any way to get the root error out, which means we can't differentiate cert errors from other errors. |
|
Boo! We should open a ticket for that upstream (breaking change.. again); do you think we should open a PR upstream? While looking at the code, I think I saw that other errors have a |
|
@dperny could this be the cause? grpc/grpc-go@7aea499#diff-3081f1aaaf958d7d5e00ccec7a62ea6f Not sure if we hit that part, but it's changing |
Codecov Report
@@ Coverage Diff @@
## master #2631 +/- ##
==========================================
- Coverage 67.21% 58.26% -8.96%
==========================================
Files 104 10 -94
Lines 15450 1253 -14197
==========================================
- Hits 10384 730 -9654
+ Misses 3998 453 -3545
+ Partials 1068 70 -998 |
|
YOLO - pushed a commit to see if that was indeed the problem 😅 edit: nope, that didn't work; removed again |
|
Negative, the error that's bubbling up instead of the x509 error is this: my understanding of what changed in GRPC is that it now integrally uses a The end result, though, is that calling grpc tries to get one of these SubConns to use. But, because the x509 error is considered "Transient", we get back an error that all of the SubConns are in a transient failure state (meaning we could expect them to work again at some point, except the whole point of the broken code is to catch a case where we know the system won't work again at any point). |
fcb8f13 to
b3ad347
Compare
|
@dperny do you see a path forward on this or need help looking into it? |
|
Looking into the |
|
Update: So for some of the tests, we can just use our own GRPC dialer. I'm less happy about doing this in our actual code however, because we'd have to write a dialer that did the TLS handshake ourselves, and then pass the error back via some other channel. If we don't use our own dialer for the code, then:
There is apparently a workaround in 1.11 (see grpc/grpc-go#1855 and grpc/grpc-go#1917) where if you make an RPC call instead of dialing with However, in 1.11, Can we upgrade swarmkit to 1.11+ instead, or do we have to be on 1.10? If we want to stay on 1.10, then maybe we just live with the slight change in behavior, and I'll just skip checking those errors on tests I guess. If we want to upgrade to 1.11+ (which we'd eventually need to do anyway) then I can work on trying to get around our use of |
|
Yes, in the long run, we will be updating to newer versions (and it's probably good to update more frequent so that we catch breaking changes early) @dmcgowan - are there plans to update containerd to a more recent version, and if so, would that be a change that's acceptable for an 1.1.x patch release of containerd? |
We would like to keep containerd as up to date with GRPC as possible. We don't need to backport to a release since our primary concern here is the client end, which does not need to be vendored to a release. |
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
b3ad347 to
97d0e4e
Compare
|
superseded by #2649 |
Trying to get these the same as containerd, but looks like we need to bump protobuf to 1.1.0, to make it build;
It builds now, but CI probably fails