Install MWC and VWC During Control Plane Installation#2796
Install MWC and VWC During Control Plane Installation#2796
Conversation
Signed-off-by: Ivan Sim <[email protected]>
Signed-off-by: Ivan Sim <[email protected]>
Signed-off-by: Ivan Sim <[email protected]>
Signed-off-by: Ivan Sim <[email protected]>
Signed-off-by: Ivan Sim <[email protected]>
|
Integration test results for 2ee6a45: fail 😕 |
2ee6a45 to
6ef3eea
Compare
Signed-off-by: Ivan Sim <[email protected]>
6ef3eea to
c56df00
Compare
|
Integration test results for c56df00: fail 😕 |
alpeb
left a comment
There was a problem hiding this comment.
Looking good so far, I haven't got into the meaty stuff yet, but here are some minor nits/tiolis ;-)
There was a problem hiding this comment.
The changes required minimal impact to the existing webhooks framework, which is awesome 👍
Regarding the certificate renewal, IIUC, the identity trust root suffers from the same problem, as both use tls.GenerateRootCAWithDefaults() which defaults to a year. Something we might wanna tackle simultaneously I guess.
You also need to add to the to-do list the removal of the code for generating the old configs dynamically. From what I see both hooks are still doing that, although from my testing I couldn't see that the webhook configs set during install were being overwritten when the webhooks started.
I also tested HA for the profile injector, using 2 replicas. For some reason only one of the pods gets used. I thought it could be the last injector being started overwriting the config and so grabbing all the work for itself, but like I said above I couldn't see the install config being overwritten...
cli/cmd/install.go
Outdated
| defaultIdentityIssuanceLifetime = 24 * time.Hour | ||
| defaultIdentityClockSkewAllowance = 20 * time.Second | ||
|
|
||
| caCommonName = "ca.linkerd.cluster.local" |
There was a problem hiding this comment.
curious about this name, should it be namespace-specific? also is there a reason this is different from most *.svc.cluster.local names, or is this convention?
There was a problem hiding this comment.
Yeah, I didn't put much thought into this name. LMK what's most appropriate. I basically follow the identity issuer's name convention; i.e. without the .svc (which makes sense, since there isn't a Service with this name).
There was a problem hiding this comment.
I think following the identity issuer's name makes sense. Note that the name varies by namespace.
|
|
||
| Identity *installIdentityValues | ||
|
|
||
| CATrust *caTrustValues |
There was a problem hiding this comment.
i'm a little out of context, can you explain the motivation for separate KeyPEM/CertPem values for the MWC/VWC and the identify system values in installIdentityValues? or, taking this a step further, re-using the linkerd-identity-issuer secret?
There was a problem hiding this comment.
Based on my brief chat with @olix0r, we prefer to keep them separated, because they exist for different purposes. installIdentityValues, as we all know, is used for data plane mTLS and consist of fields like issuance life time, clock skew allowances etc. that aren't relevant to the CA. The CA is used to sign certs that will be used by webhooks and taps to authenticate with the k8s API Server.
Signed-off-by: Ivan Sim <[email protected]>
That's surprising, because the
Yeah, good point. I am kinda hesitant at first, because in the past, we did have users encountering outdated MWC after upgrading the version (before |
Signed-off-by: Ivan Sim <[email protected]>
|
Integration test results for 610ec3c: success 🎉 |
Signed-off-by: Ivan Sim <[email protected]>
|
Re HA, for some reason k8s continues to use only one webhook, but if you kill it, the other one starts receiving requests, so this should be fine 👍 |
|
Integration test results for be7e8b3: success 🎉 |
|
@alpeb Yeah, it's very strange that the k8s api server seems to only send requests to one replica. AFACIT, the If anything, it's good that even when some of the replicas entered into a crash loop, auto-inject still works. |
Signed-off-by: Ivan Sim <[email protected]>
|
Integration test results for 82225e9: success 🎉 |
siggy
left a comment
There was a problem hiding this comment.
a few more comments.
i hit an issue where the proxy-injector was not injecting proxies. this was on a 2-stage install, but not sure if that's related:
$ bin/linkerd install config | kubectl apply -f -
$ bin/linkerd install control-plane | kubectl apply -f -
$ bin/linkerd check
$ curl https://run.linkerd.io/emojivoto.yml | bin/linkerd inject - | kubectl apply -f -
...
$ bin/linkerd logs --control-plane-component proxy-injector
+ linkerd linkerd-proxy-injector-56459d9d47-6q4r8 › proxy-injector
+ linkerd linkerd-proxy-injector-56459d9d47-6q4r8 › linkerd-proxy
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector time="2019-05-08T10:52:44Z" level=info msg="running version git-82225e97"
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector time="2019-05-08T10:52:44Z" level=info msg="waiting for caches to sync"
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector time="2019-05-08T10:52:45Z" level=info msg="caches synced"
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector time="2019-05-08T10:52:45Z" level=info msg="listening at :8443"
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector time="2019-05-08T10:52:45Z" level=info msg="starting admin server on :9995"
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector 2019/05/08 10:55:11 http: TLS handshake error from 127.0.0.1:43100: remote error: tls: bad certificate
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector 2019/05/08 10:55:12 http: TLS handshake error from 127.0.0.1:43104: remote error: tls: bad certificate
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector 2019/05/08 10:55:12 http: TLS handshake error from 127.0.0.1:43116: remote error: tls: bad certificate
linkerd linkerd-proxy-injector-56459d9d47-6q4r8 proxy-injector 2019/05/08 10:55:13 http: TLS handshake error from 127.0.0.1:43130: remote error: tls: bad certificate| apiVersion: admissionregistration.k8s.io/v1beta1 | ||
| kind: MutatingWebhookConfiguration | ||
| metadata: | ||
| name: linkerd-proxy-injector-webhook-config |
There was a problem hiding this comment.
is it possible to name this (and analogous sp-validator) to include the namespace: linkerd-{{.Namespace}}-proxy-injector-webhook-config
we follow this pattern with ClusterRole[Binding]s because they're global resources, it conveys which control plane they are associated with.
or, does it become a problem to have multiple webhooks installed, particularly proxy injectors?
| name: linkerd-proxy-injector | ||
| namespace: {{.Namespace}} | ||
| --- | ||
| apiVersion: admissionregistration.k8s.io/v1beta1 |
There was a problem hiding this comment.
not necessarily for this PR, but now that MWC/VWC are part of linkerd install config, we should probably update linkerd check config to validate successful install.
That's a regression from the change to move the MWC and VWC to the |
This ensures that CA secret is in-sync with the MWC and VWC trust anchors. Since the MWC and VWC are cluster-scoped, they must be created in the 'config' stage. Moving the CA secret to the 'config' stage is simpler than trying to sync it with the MWC and VWC trust anchors. Signed-off-by: Ivan Sim <[email protected]>
|
@ver @adleong To recap, this PR creates a new @ver IIUC, you are saying that we don't need to persist the CA in the secret. During install and upgrade, we just (re-)create the CA cert, private key and self-signed certs. The implication is that the MWC and VWC must be re-created during upgrade, to use the new CA assets. Is that the high level idea? |
|
Thanks for the recap, @ihcsim. My understanding is that on startup, each instance of the webhook needs to generate a signed certificate. To do this, it needs the CA certificate and CA key. This can be achieved by storing the CA cert and key in a secret (or by storing the CA key in a secret and providing the CA cert in an env var). If, as @olix0r suggests, the CA key is thrown away after install, how does a new instance of the webhook get a signed certificate? Or are you suggesting that the webhook certificate should be generated and signed during install and all webhook replicas should share that cert? |
|
I guess my point is that we don't need a CA at all... we only need a self-signed certificate. If we want to generate a root CA to generate the leaf certificates, then I don't see any value in retaining the root certificate. During upgrade, we can just read the self-signed cert & key, as we do for the issuer certs. If you ever want to create new self-signed certs for these pods, we just generate a new key. Maybe I'm getting hung up on use of the term "CA" -- are we just generating a self-signed cert? or are we generating a root CA that then signs leaf certificates? If the former, then I don't think there's anything to do but to store the keypair. If the latter, there's no need to store the root CA key; but also there's no real need to have the level of indirection, imo. |
|
After a little thought, my suggestion is that we generate a key & self-signed cert for each webhook. If all of the webhooks share a single CA and have to access the root key, then that is effectively equivalent to having them all share a keypair (trust-wise). I don't think the level of indirection that a shared CA provides grants us any security benefits. |
|
My assumption was that each webhook replica would have a distinct leaf keypair, but after talking with you, I see there's not really any value in that. So I think where we're landing with this is that at install, we should generate a (self-signed) certificate and key at startup, put the certificate in the webhook config as the "CA", and distribute the certificate and key to the webhook instances. |
|
Integration test results for 0355c06: success 🎉 |
|
Superseded by #2806. |
This PR introduces changes to install the proxy injector and profile validator MWC and VWC during the control plane installation.
Highlights
Secretnamedlinkerd-cacontrol-planestageproxy-injectorandsp-validatorto:linkerd-caas a read-only volumecert.pemandkey.pemto generate a new x509 TLS cert (via thepkg/tlslibrary)cert.pemandkey.pemof thelinkerd-casecret during upgrade (similar mechanism aslinkerd-config)tls.CA.This PR doesn't affect the existing identity mTLS.
There are no CLI flags for users to override the CA trust with custom cert and private key.
Test Cases
manualinject and auto-injectToDo
install --ignore-clusterwon't work for as it will affect the identity TLS assets.--from-manifestsintegration tests.Fixes #2176