
Introduce install stages #2719

Merged
siggy merged 10 commits into master from siggy/install-stages on Apr 23, 2019
Conversation

@siggy (Member) commented Apr 18, 2019

This change introduces two named parameters for linkerd install, split
by privilege:

  • linkerd install config
    • Namespace
    • ClusterRoles
    • ClusterRoleBindings
    • CustomResourceDefinition
    • ServiceAccounts
  • linkerd install control-plane
    • ConfigMaps
    • Secrets
    • Deployments
    • Services

Comprehensive linkerd install is still supported.
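
The stage split above can be sketched as a partition of rendered resources. This is a minimal illustrative sketch, not the actual install code: the resource names come from the list above, but the `resourcesFor` helper and its stage keys are hypothetical.

```go
package main

import "fmt"

// stageResources partitions the rendered resources by install stage, as
// described in the PR summary (hypothetical structure for illustration).
var stageResources = map[string][]string{
	"config": {
		"Namespace", "ClusterRoles", "ClusterRoleBindings",
		"CustomResourceDefinition", "ServiceAccounts",
	},
	"control-plane": {
		"ConfigMaps", "Secrets", "Deployments", "Services",
	},
}

// resourcesFor returns the resources a stage renders; the empty stage
// stands for the comprehensive `linkerd install`, which renders both
// stages in order.
func resourcesFor(stage string) []string {
	if stage == "" {
		all := append([]string{}, stageResources["config"]...)
		return append(all, stageResources["control-plane"]...)
	}
	return stageResources[stage]
}

func main() {
	fmt.Println(resourcesFor("config"))
	fmt.Println(len(resourcesFor("")))
}
```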

TODO:

  • linkerd check support
  • linkerd upgrade support
  • integration tests

Part of #2337

Signed-off-by: Andrew Seigner [email protected]

@siggy siggy self-assigned this Apr 18, 2019
@siggy siggy requested review from alpeb and ihcsim April 18, 2019 18:24
@l5d-bot (Collaborator) commented Apr 18, 2019

@siggy (Member Author) commented Apr 18, 2019

Per #2337 (comment) I'm considering scoping stage 1 strictly to cluster-level resources:

  • linkerd install config
    • Namespace
    • ClusterRoles
    • ClusterRoleBindings
    • CustomResourceDefinition
  • linkerd install control-plane
    • ConfigMaps
    • ServiceAccounts
    • Secrets
    • Deployments
    • Services

@alpeb (Member) left a comment

Looks like a good start to me 👍

Per your last comment, if you limit the first stage to just cluster-level resources, you'd increase the chance of getting warning events during the second stage.

Also users might start entering config flags in the first stage, so we'll have to persist them in the main ConfigMap in that stage. This also means we'll have to update that config with flags passed during the second stage. We can't fail in the second stage if there's already a ConfigMap, so most likely we'll have to introduce a new field to distinguish between finished installs and configed-only ones.

I added a couple of comments regarding ordering/dependencies in #2337 (comment).

linkerd install | kubectl apply -f -

# Installation may also be broken up into two stages by user privilege.
linkerd install config | kubectl apply -f -
Member

How about adding a note that this stage requires cluster-admin privileges?

proxyInjectorConfigTemplateName = "templates/proxy_injector-config.yaml"
proxyInjectorTemplateName = "templates/proxy_injector.yaml"
spValidatorConfigTemplateName = "templates/sp_validator-config.yaml"
spValidatorTemplateName = "templates/sp_validator.yaml"
Member

Out of curiosity, why did these get the axe?

Member Author

Since these strings only occur in one place, I prefer to just hard-code them. This way I can look at one place in the file rather than two to get my head around what's happening.

@ihcsim (Contributor) left a comment

LGTM. I assume the TODOs will happen in other PRs?

As for scoping stage 1 strictly to cluster-level resources, I think it depends on whether an operator/admin who runs linkerd install config cares about things like secrets and global configs; in particular, I am thinking of the --identity-* flags.

@olix0r (Member) commented Apr 19, 2019

My expectation is the linkerd control plane admin will have to care about the issuer credentials; not the cluster admin.

@siggy siggy force-pushed the siggy/install-stages branch from 507f9f3 to bb9fc00 Compare April 19, 2019 17:56
@siggy (Member Author) commented Apr 19, 2019

@alpeb

Per your last comment, if you limit the first stage to just cluster-level resources, you'd increase the chance of getting warning events during the second stage.

I think this is ok for now (at least not worse than current state). Per #2337 (comment), Kubernetes is designed to handle not everything being available during startup, and I'm increasingly feeling that we should not try to control/solve deployment ordering.

Also users might start entering config flags in the first stage, so we'll have to persist them in the main ConfigMap in that stage. This also means we'll have to update that config with flags passed during the second stage. We can't fail in the second stage if there's already a ConfigMap, so most likely we'll have to introduce a new field to distinguish between finished installs and configed-only ones.

I've rebased the whole PR to make the first stage cluster-level resources only, and also added some code to disable most flags for the first stage.

@l5d-bot (Collaborator) commented Apr 19, 2019

@siggy siggy force-pushed the siggy/install-stages branch from bb9fc00 to 7e4ad1f Compare April 19, 2019 18:38
@l5d-bot (Collaborator) commented Apr 19, 2019

@siggy siggy force-pushed the siggy/install-stages branch from 7e4ad1f to a1d4f9b Compare April 19, 2019 19:40
@l5d-bot (Collaborator) commented Apr 19, 2019

if f.Changed {
switch f.Name {
// TODO: remove "proxy-auto-inject" when it becomes default
case "linkerd-namespace", "proxy-auto-inject":
Member

I think we should add to this list ignore-cluster and all the global config flags.

Member

I meant all the "global flags" (--api-addr, etc)

Member Author

The flags var passed into this function comes from installOptions.recordableFlagSet(), which does not include the "global flags".

$ bin/go-run cli install config --api-addr foo > /dev/null && echo $?
0
$ bin/go-run cli install config --admin-port 123 > /dev/null && echo $?
Error: flag not available for config stage: --admin-port
Usage:

... that said, i'm realizing that there is also installOnlyFlagSet, none of which should be set during linkerd install config, i have pushed a new commit to reflect this. i don't think linkerd install config --ignore-cluster should be supported, as it has no effect on the output.

Member

linkerd install config --ignore-cluster should be supported, as it has no effect on the output.

linkerd install and linkerd install config both check if linkerd already exists in the cluster, so I was thinking on allowing that flag in both cases for consistency.

@siggy (Member Author) Apr 19, 2019

oops sorry i mean linkerd install config --ignore-cluster should not be supported (i think you read it the way i intended anyway ;) ) ...

nice catch! i failed to realize that linkerd install config would error out if a cluster already existed, and not providing an --ignore-cluster flag leaves the user completely unable to render that output. i've pushed a new commit to always allow linkerd install config to succeed, regardless of cluster state.

@alpeb (Member) commented Apr 19, 2019

If not through some sort of state field in the global ConfigMap, what's the plan for ensuring linkerd install control-plane gets run after linkerd install config? I assume listing ClusterRoles and ClusterRoleBindings is not available under simple namespace-level privileges, so we could only maybe check the existence of the linkerd namespace, which would be a very limited check...

@siggy (Member Author) commented Apr 19, 2019

If not through some sort of state field in the global ConfigMap, what's the plan for ensuring linkerd install control-plane gets run after linkerd install config? I assume listing ClusterRoles and ClusterRoleBindings is not available under simple namespace-level privileges, so we could only maybe check the existence of the linkerd namespace, which would be a very limited check...

@alpeb I think we can manage a lot of this via docs. I'd like to ensure users can run these commands in any order, just to see the output. While the behavior of linkerd install and linkerd upgrade are now reliant on k8s state, I'd like to minimize that surface area.

@l5d-bot (Collaborator) commented Apr 19, 2019

@dadjeibaah (Contributor)

I kinda think we may need to evaluate the user experience more when a user tries to install the control plane without running linkerd install config. When I run the command, I get a ton of

Error from server (NotFound): error when creating "STDIN": namespaces "linkerd" not found

Which may be a little shocking when running install control-plane. WDYT?

@siggy (Member Author) commented Apr 20, 2019

Which may be a little shocking when running install control-plane. WDYT?

@dadjeibaah Your comments and @alpeb's are making me think we need to guide the user a bit more via the cli. Maybe we can have linkerd install control-plane return an error if the namespace (and maybe RBAC?) does not exist, and provide a flag to just get the YAML if they really want it:

$ linkerd install control-plane
Error: 'linkerd' namespace does not exist, run 'linkerd install config' first, or 'linkerd install control-plane --ignore-cluster' to get the output anyway.

$ linkerd install control-plane --ignore-cluster
---
kind: Service
...

@grampelberg thoughts?

@alpeb (Member) commented Apr 22, 2019

It's not clear which flags are available to each stage. Do you think it would be too much to have these be totally different commands, like install-config and install-control-plane? We'd of course keep the full install command, with a note explaining that the install can be split using those other two commands.

@siggy (Member Author) commented Apr 22, 2019

@alpeb good points re: which flags work for which commands. Creating different commands would make that clearer. I think another approach could be to set these up as proper Cobra subcommands of install.

Either way, I'd like to get this baseline functionality merged as-is, to minimize merge conflicts with ongoing install/inject work. I've filed #2729 to track this.

@siggy (Member Author) commented Apr 22, 2019

@alpeb @dadjeibaah @grampelberg latest commit causes linkerd install control-plane to fail if the namespace does not exist:

$ bin/go-run cli install control-plane
Failed to find the Linkerd control-plane namespace. If this expected, use the --ignore-cluster flag.
Error: The "linkerd" namespace does not exist

$ bin/go-run cli install control-plane --ignore-cluster
---
kind: ConfigMap
...
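
The gate behind that behavior can be sketched as follows. The `namespaceChecker` interface is a hypothetical stand-in for the Kubernetes API lookup (the actual code goes through the healthchecker's namespace check), so the logic can be demonstrated without a live cluster.

```go
package main

import (
	"fmt"
	"os"
)

// namespaceChecker abstracts the Kubernetes API lookup (hypothetical
// interface for illustration).
type namespaceChecker interface {
	NamespaceExists(name string) (bool, error)
}

// gateControlPlaneInstall mirrors the behavior quoted above: fail fast when
// the control-plane namespace is missing, unless --ignore-cluster was set.
func gateControlPlaneInstall(c namespaceChecker, ns string, ignoreCluster bool) error {
	if ignoreCluster {
		return nil // render the YAML regardless of cluster state
	}
	exists, err := c.NamespaceExists(ns)
	if err != nil {
		return err
	}
	if !exists {
		fmt.Fprintln(os.Stderr, "Failed to find the Linkerd control-plane namespace. If this is expected, use the --ignore-cluster flag.")
		return fmt.Errorf("the %q namespace does not exist", ns)
	}
	return nil
}

// fakeCluster is a stub cluster used for demonstration.
type fakeCluster struct{ namespaces map[string]bool }

func (f fakeCluster) NamespaceExists(name string) (bool, error) {
	return f.namespaces[name], nil
}

func main() {
	empty := fakeCluster{namespaces: map[string]bool{}}
	fmt.Println(gateControlPlaneInstall(empty, "linkerd", false)) // missing namespace: install fails
	fmt.Println(gateControlPlaneInstall(empty, "linkerd", true))  // --ignore-cluster bypasses the check
}
```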

@l5d-bot (Collaborator) commented Apr 22, 2019

hc := newHealthChecker(
[]healthcheck.CategoryID{healthcheck.KubernetesAPIChecks},
time.Time{},
)
Member

TIOLI: we might gain some consistency by reusing this same minimalistic healthchecker in exitIfClusterExists. That might need some new hc.CheckConfigMap method to replace the direct call we're making in exitIfClusterExists.

Member Author

Agree, I looked at refactoring this a bit further, then backed it out in the interest of getting this PR through. I captured your comment, and related requirements for healthcheck and client creation via #2735.

@grampelberg (Contributor) left a comment

  • I'd like to expand on #2729 a little bit with different help text as well as flags.
  • @adleong had a good suggestion for a flag that was different than --ignore-cluster, it seemed like a good idea at the time.
  • Should we move all service accounts into stage1? There's no RBAC but as a cluster admin I'd restrict creation of service accounts on a per-namespace basis.

Mostly just some questions/comments, this feels pretty good as it is. Definitely a step in the right direction!

@adleong (Member) commented Apr 23, 2019

The check that the Linkerd namespace exists before running linkerd install control-plane feels similar to other pre-command checks (e.g. checking that the control plane is reachable before running stat) so I'd call the flag something like --skip-checks. If I understand correctly, --ignore-cluster is specifically for overriding the linkerd config when running install over an existing install.


err := hc.CheckNamespace(context.Background(), controlPlaneNamespace, true)
if err != nil {
fmt.Fprintln(os.Stderr, "Failed to find the Linkerd control-plane namespace. If this expected, use the --ignore-cluster flag.")
@adleong (Member) Apr 23, 2019

Probably our first suggestion should be to run linkerd install config. Maybe something like:

"Failed to find required control-plane namespace: %s. Run linkerd install config -l %s | kubectl apply -f - to create it (this requires cluster administration permissions.) See {website docs link} for more information. Or use {flag} to proceed anyway."

siggy added 8 commits April 23, 2019 10:42
This change introduces two named parameters for `linkerd install`, split
by privilege:
- `linkerd install config`
  - Namespace
  - ClusterRoles
  - ClusterRoleBindings
  - CustomResourceDefinition
- `linkerd install control-plane`
  - ConfigMaps
  - ServiceAccounts
  - Secrets
  - Deployments
  - Services

Comprehensive `linkerd install` is still supported.

TODO:
- `linkerd check` support
- `linkerd upgrade` support
- integration tests

Part of #2337

Signed-off-by: Andrew Seigner <[email protected]>
Signed-off-by: Andrew Seigner <[email protected]>
@siggy siggy force-pushed the siggy/install-stages branch from e2fc912 to 3e7cf8d Compare April 23, 2019 17:49

success := hc.RunChecks(exitOnError)
if !success {
fmt.Fprintln(os.Stderr, "Failed to connect to Kubernetes. If this expected, use the --ignore-cluster flag.")
Member

I think this message needs to be updated as well

Signed-off-by: Andrew Seigner <[email protected]>
@l5d-bot (Collaborator) commented Apr 23, 2019

@adleong (Member) left a comment

⭐ 🐶 LGTM!

package cmd

// TODO: this file should probably move out of `./cli/cmd` into something like
// `./cli/publicapi` or `./cli/pkg`
Member

maybe create an issue and link to it in this TODO?

Member Author

filed #2735 yesterday to track, have updated the comment for reference.

Signed-off-by: Andrew Seigner <[email protected]>
@alpeb (Member) left a comment

Looks good to me 👍 :shipit:

@siggy siggy merged commit b2b4780 into master Apr 23, 2019
@l5d-bot (Collaborator) commented Apr 23, 2019

@siggy siggy deleted the siggy/install-stages branch April 23, 2019 22:39
@grampelberg grampelberg mentioned this pull request Apr 24, 2019
8 participants