containerd hangs on start in low-entropy situations #2451
Comments
It looks like this patch allows containerd to start "enough" to not block other services in LinuxKit from starting:
diff --git a/services/leases/local.go b/services/leases/local.go
index d3e0c2f2c5..739e58ea9e 100644
--- a/services/leases/local.go
+++ b/services/leases/local.go
@@ -17,9 +17,9 @@
 package leases

 import (
-	"crypto/rand"
 	"encoding/base64"
 	"fmt"
+	"math/rand"
 	"time"

 	"google.golang.org/grpc"
but containerd itself still hangs, presumably due to the other uses of |
The hang was not expected. The lease generator is designed so that in low-entropy situations the generated ID is just less random, rather than blocking or erroring. Looks like we need to re-address this assumption that |
This assumption was based on previous experience here https://github.com/docker/distribution/blob/master/uuid/uuid.go#L56 |
Yeah, I think a lot of userspace code that makes the non-blocking assumption is going to start hanging in very hard-to-diagnose ways on early-boot when the newer kernel versions become more prevalent 😞. Would you accept a PR to simply change The |
Yes, I agree cryptographically random is not necessary. We also add a nanosecond component, we could update this in the future to use UUID logic. |
We can also just implement a reader with |
@crosbymichael I agree that it'd be useful to have a One thing to watch for is that Probably simpler and less surprising to stick to |
@hairyhenderson ok, that makes sense. Thanks! |
#2454 got merged, which is great, and that'll solve the main blocking issue. But there appears to be something else blocking - in my LinuxKit setup I get a shell from So I'm going to keep this issue open for now - I want to dig a little deeper and see if there's another blocking |
We really need to get rid of the This does raise an important issue which was not part of the PR, random should be seeded, but we should seed it in the binary command and not in the imports. |
Yup, that'll do it. 😂 |
Is there anything else we still need to investigate here or are the 2 PRs already merged sufficient? |
👋 @estesp 😁 TBH I'm not sure. I seem to recall that the patched containerd still had some entropy-related hanging. But my memory's fuzzy on exactly what combination of things I tried. That particular project has been a bit on hold while I've enjoyed some summer vacation, but I'll try to get back to this in the next few days to see if the latest containerd still stalls. |
Contains: containerd/containerd@cce0a46 containerd/containerd@9a97ab3 Signed-off-by: Robert Günzler <[email protected]>
Update vendor.conf and vendor/ to include balena-os/balena-containerd#2 Connects-to: #105 Connects-to: containerd/containerd#2451 Signed-off-by: Robert Günzler <[email protected]>
Description
Based on discussion in linuxkit/linuxkit#3032 and linuxkit/linuxkit#3096, it seems that a call to crypto/rand's Read function will block when running on Linux kernel versions 4.14.36 and up, due to a fix for CVE-2018-1108.
As I understand it, prior to the fix, /dev/random was usable before enough entropy was gathered to consider it as a cryptographically-safe source. The fix effectively causes reads from /dev/random to block until enough entropy was gathered to fully initialize the CRNG. And Go's crypto/rand Read function uses the Linux getrandom(2) syscall (without setting GRND_NONBLOCK), which reads from /dev/random and therefore now blocks.
Anyway, all this boils down to this line:
containerd/services/leases/local.go, line 118 in 01d309e
which will now block when not enough entropy is available. In my case it blocks for ~2mins, but it varies.
Based on the PR comment when that code was originally introduced, I believe that using math/rand instead would be "good enough" - especially: (from @dmcgowan)
Steps to reproduce the issue:
virtio-rng-pci device may work too
Describe the results you received:
containerd blocks for ~2mins on start
Describe the results you expected:
containerd starting ~2mins faster 😉
Output of containerd --version: