Focus install by role and stage #2337

@grampelberg

Description

Task Breakdown

Tasks we may or may not do

  • disable linkerd install
  • install-sp integration
  • install-cni integration

Previous

What problem are you trying to solve?

Today, Linkerd has an all-in-one install via linkerd install. As features have been added, this is becoming a larger problem because:

  • CRDs and their resource definitions cannot be applied at the same time. This has led to a separate install-sp command that users need to run after install if they would like service profiles for the control plane. While control plane service profiles are optional right now, it is likely that they will be required in the future to support new functionality.
  • Teams run on locked-down clusters with two distinct roles: a cluster admin who can add resources such as CRDs and ClusterRoles, and a namespace admin who can add Roles and Deployments. The hypothesis is that cluster admins do not want to operate the components of Linkerd themselves, only audit that namespace admins won't hurt the cluster.
  • The CNI plugin adds an ordering requirement: it must be installed before the control plane components themselves.
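The CRD ordering issue can be seen with a minimal manifest. The sketch below (hypothetical resource contents, using the serviceprofiles.linkerd.io CRD from this proposal) can fail under a single kubectl apply, because the ServiceProfile is validated before the CRD is established:

  apiVersion: apiextensions.k8s.io/v1beta1
  kind: CustomResourceDefinition
  metadata:
    name: serviceprofiles.linkerd.io
  spec:
    group: linkerd.io
    version: v1alpha1
    scope: Namespaced
    names:
      plural: serviceprofiles
      kind: ServiceProfile
  ---
  apiVersion: linkerd.io/v1alpha1
  kind: ServiceProfile
  metadata:
    name: linkerd-controller-api.linkerd.svc.cluster.local
    namespace: linkerd

Splitting the CRD into its own apply step avoids this race, which is what install-sp does today and what the config phase would do going forward.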

How should the problem be solved?

To address these issues, the install should be separated into multiple phases. By using install subcommands to implement these phases, linkerd check --pre can be integrated into the install experience. As a tl;dr:

$ linkerd install config | kubectl apply -f -
$ linkerd install components | kubectl apply -f -

Pre-install experience

  • Move check --pre into the existing install command.
  • Order linkerd-version checks first instead of last.

For users without cluster-admin:

$ linkerd install
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

pre-kubernetes-cluster-setup
----------------------------
√ control plane namespace does not already exist
× can create Namespaces
× can create ClusterRoles
× can create ClusterRoleBindings
× can create CustomResourceDefinitions

pre-kubernetes-setup
--------------------
√ can create ServiceAccounts
√ can create Services
√ can create Deployments
√ can create ConfigMaps
√ can use NET_ADMIN

Status check results are ×

It appears that the current user does not have the required permissions. Take a
look at https://linkerd.io/2/install/#permissions for some details on what's
required and ways around it.

This doc would explain what permissions are required, suggest running linkerd install config and recommend passing the YAML to a cluster-admin.

Users who do have the correct privileges but run on a cluster with a restrictive PSP:

$ linkerd install
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

pre-kubernetes-cluster-setup
----------------------------
√ control plane namespace does not already exist
√ can create Namespaces
√ can create ClusterRoles
√ can create ClusterRoleBindings
√ can create CustomResourceDefinitions

pre-kubernetes-setup
--------------------
√ can create ServiceAccounts
√ can create Services
√ can create Deployments
√ can create ConfigMaps
× can use NET_ADMIN

Status check results are ×

The current cluster has a restrictive PodSecurityPolicy that does not allow
pods to use NET_ADMIN. Take a look at https://linkerd.io/2/install/#psp.

The doc in question would then recommend that users continue with the install, running linkerd install forwarding alongside config and components with the requisite flags.
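On such a cluster, the end-to-end sequence might look like the following. This is a sketch: only commands proposed elsewhere in this issue are used, and the exact flags components would need to skip the init container are still open.

$ linkerd install config | kubectl apply -f -
$ linkerd check config
$ linkerd install forwarding --dest-cni-bin-dir /home/kubernetes | kubectl apply -f -
$ linkerd check forwarding
$ linkerd install components | kubectl apply -f -
$ linkerd check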

$ linkerd install
linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

pre-kubernetes-cluster-setup
----------------------------
√ control plane namespace does not already exist
√ can create Namespaces
√ can create ClusterRoles
√ can create ClusterRoleBindings
√ can create CustomResourceDefinitions

pre-kubernetes-setup
--------------------
√ can create ServiceAccounts
√ can create Services
√ can create Deployments
√ can create ConfigMaps

Status check results are √

To install Linkerd on your cluster, run:

  linkerd install config | kubectl apply -f -
  linkerd check config
  linkerd install components | kubectl apply -f -
  linkerd check

On any cluster that is completely open, a user will be able to cut and paste the commands and get a one-shot experience very similar to what exists today.

Cluster Config

The first step would contain all the resources that require cluster-admin. This is primarily RBAC that can be handed off to a cluster-admin or audited. I'm proposing a couple of other changes:

  • Inject becomes a default install (or at least the roles and bindings).
$ linkerd install config | kubectl apply -f -
namespace/linkerd created
serviceaccount/linkerd-ca created
serviceaccount/linkerd-controller created
serviceaccount/linkerd-grafana created
serviceaccount/linkerd-prometheus created
serviceaccount/linkerd-proxy-injector created
serviceaccount/linkerd-web created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-ca created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-controller created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
clusterrole.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-ca created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-controller created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
customresourcedefinition.apiextensions.k8s.io/serviceprofiles.linkerd.io created

There needs to be a way to verify that someone has run linkerd install config on a cluster. This makes sure everything is set up correctly and handles the potential issue of asynchronous application (emailing the YAML, for example).
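Until a dedicated check exists, this verification can be approximated by hand with kubectl, spot-checking the resource names from the output above:

$ kubectl get namespace linkerd
$ kubectl get crd serviceprofiles.linkerd.io
$ kubectl get clusterroles | grep linkerd
$ kubectl get serviceaccounts -n linkerd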

For a failure case:

$ linkerd check config
√ has linkerd Namespace
× has all ServiceAccounts
√ has all ClusterRoles
√ has all ClusterRoleBindings
√ has CRD

Check results are ×

Missing serviceaccount/linkerd-web, run linkerd install config to fix.

For success:

$ linkerd check config
√ has linkerd Namespace
√ has all ServiceAccounts
√ has all ClusterRoles
√ has all ClusterRoleBindings
√ has CRD

Note: this check doesn't feel right; the command and the output are both odd and don't match other commands. Any good suggestions?

Forwarding

Install the CNI plugin in preparation for components.

$ linkerd install forwarding --dest-cni-bin-dir /home/kubernetes | kubectl apply -f -
namespace/linkerd configured
serviceaccount/linkerd-cni created
clusterrole.rbac.authorization.k8s.io/linkerd-cni created
clusterrolebinding.rbac.authorization.k8s.io/linkerd-cni created
configmap/linkerd-cni-config created
daemonset.extensions/linkerd-cni created

Then check:

$ linkerd check forwarding
√ cluster config is compatible
√ is running the DaemonSet
√ has configured CNI on every node in the cluster
√ pods are correctly having traffic forwarded

Components

This command can be run by namespace admins and allows them to manage the components themselves.

  • --single-namespace gets folded into this command. The primary difference currently is between ClusterRoles and Roles. The biggest question is whether CRDs will become required by the control plane in the future.
  • The namespace is only in config and not components.
$ linkerd install components | kubectl apply -f -
serviceaccount/linkerd-ca unchanged
serviceaccount/linkerd-controller unchanged
serviceaccount/linkerd-grafana unchanged
serviceaccount/linkerd-prometheus unchanged
serviceaccount/linkerd-proxy-injector unchanged
serviceaccount/linkerd-web unchanged
role.rbac.authorization.k8s.io/linkerd-linkerd-ca created
role.rbac.authorization.k8s.io/linkerd-linkerd-controller created
role.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
role.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
rolebinding.rbac.authorization.k8s.io/linkerd-linkerd-ca created
rolebinding.rbac.authorization.k8s.io/linkerd-linkerd-controller created
rolebinding.rbac.authorization.k8s.io/linkerd-linkerd-prometheus created
rolebinding.rbac.authorization.k8s.io/linkerd-linkerd-proxy-injector created
service/linkerd-controller-api created
service/linkerd-grafana created
service/linkerd-prometheus created
service/linkerd-proxy-api created
service/linkerd-proxy-injector created
service/linkerd-web created
deployment.extensions/linkerd-ca created
deployment.extensions/linkerd-controller created
deployment.extensions/linkerd-grafana created
deployment.extensions/linkerd-prometheus created
deployment.apps/linkerd-proxy-injector created
deployment.extensions/linkerd-web created
configmap/linkerd-grafana-config created
configmap/linkerd-prometheus-config created
configmap/linkerd-proxy-injector-sidecar-config created

With a final check for good luck:

$ linkerd check
kubernetes-api
--------------
√ can initialize the client
√ can query the Kubernetes API

kubernetes-version
------------------
√ is running the minimum Kubernetes API version

linkerd-existence
-----------------
√ control plane namespace exists
√ controller pod is running
√ can initialize the client
√ can query the control plane API

linkerd-api
-----------
√ control plane pods are ready
√ can query the control plane API
√ [kubernetes] control plane can talk to Kubernetes
√ [prometheus] control plane can talk to Prometheus

linkerd-service-profile
-----------------------
√ no invalid service profiles

linkerd-version
---------------
√ can determine the latest version
√ cli is up-to-date

control-plane-version
---------------------
√ control plane is up-to-date
√ control plane and cli versions match

Status check results are √

Helm Chart

The helm chart should be separated into similar sections so that it can be installed all at once or in each of these pieces. The crd-install hook must be used; otherwise, it should map 1:1 with the proposed install flow.
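For the CRD piece, that would mean annotating it so Helm creates it before the rest of the chart is applied, roughly as follows (a sketch, assuming the Helm 2 crd-install hook):

  apiVersion: apiextensions.k8s.io/v1beta1
  kind: CustomResourceDefinition
  metadata:
    name: serviceprofiles.linkerd.io
    annotations:
      "helm.sh/hook": crd-install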

Any alternatives you've considered?

See #2164 for a lot of discussion and alternatives.

Open Questions

  • What is the impact on --linkerd-namespace for each step? The role bindings in config will need to be namespace specific and it would be nice to allow users to operate multiple versions of Linkerd in different namespaces (impacting the roles as well).
  • linkerd check config doesn't feel quite right. It would be weird, though, to have a linkerd install run fail on half the things and still be okay. Maybe the failures become warnings if config/cni is running correctly?
  • How should the roles and components be separated for linkerd install forwarding? The same person would be doing both, and there are no ordering requirements.
  • The separation between config and components theoretically suggests that namespace owners can upgrade without running both commands. Is that a use case that should be encouraged?
