
rework helm-charts #592

Closed · brancz opened this issue Aug 29, 2017 · 44 comments

@brancz (Contributor) commented Aug 29, 2017

I just had a chat with @mgoodness about the state and future of the kube-prometheus Helm charts. We want to share our vision for them, and we invite anyone who wants to help to participate.

We imagine the general structure of the charts to be like this:

The kube-prometheus chart: this is mostly a "meta" chart that includes multiple other charts and adds the Prometheus alerting/recording rules, Grafana dashboards, and ServiceMonitors.

Its dependencies are (a sketch of how the meta chart could declare them follows the list):

  • Prometheus chart (with opinionated configuration tailored for cluster monitoring, and to work with the alertmanager instance shipped)
  • Alertmanager chart
  • Prometheus Operator chart (because it's a dependency of the prometheus and alertmanager charts)
  • grafana chart (without dashboards, but preconfigured with the k8s prometheus as datasource) (this one is already actively maintained by @weiwei04)
  • kube-state-metrics chart
  • node-exporter chart
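
For illustration, a minimal sketch of how the meta chart could declare these dependencies in a Helm requirements.yaml; the versions here are assumptions, not actual published values:

# requirements.yaml (hypothetical sketch for the kube-prometheus meta chart)
dependencies:
  - name: prometheus-operator
    version: 0.1.0   # illustrative version
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  - name: prometheus
    version: 0.1.0
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  - name: alertmanager
    version: 0.1.0
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  - name: grafana
    version: 0.1.0
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  - name: kube-state-metrics
    version: 0.1.0
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
  - name: node-exporter
    version: 0.1.0
    repository: https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/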

Previously, the kube-prometheus charts also made choices about how the web UIs of the above components are exposed. We'd like to move away from that and leave it completely up to the user.

If anyone is willing to work on any of these charts, contributions are highly appreciated and very welcome. The individual charts that are a dependency of the kube-prometheus chart should be relatively low-maintenance, as the upstream projects don't release that often.

@MattiasGees (Contributor)

This would be a really good way forward.
The company I currently work for is a heavy Kubernetes, Prometheus, and Helm user, and I am willing to help with development of the Helm charts.
I think some of my colleagues are also willing to help push the helm-charts forward and help out the community.

Kind regards,
Mattias

@brancz (Contributor, Author) commented Sep 8, 2017

That sounds fantastic! Let me know if I can help you in any way to get started 🙂 .

@ringods commented Sep 8, 2017

And @MattiasGees, don't be shy to say we are working for Skyscrapers. 😄

@kevinjqiu (Contributor)

@brancz At my $DAYJOB, we're currently using the kube-prometheus manifests to deploy our prometheuses, alerts, and dashboards. It's been great, but we're looking into rewriting the manifests as Helm charts just like you've described. We'd like to contribute to this too, if you could point us to a place to get started.

Our background:

  • using Kubernetes 1.6 in prod for a couple of months
  • using prometheus, alert manager and grafana in multiple environments

Let me know :)

Thanks!

@ptagr (Contributor) commented Sep 13, 2017

Similar use case to @kevinjqiu. We would definitely contribute back once our infra is set up.

@kevinjqiu (Contributor) commented Sep 13, 2017

@Punitag We have extended prometheus-operator with a new CRD, AlertRule, which is a simple object wrapper around a Prometheus rule. The operator syncs a ConfigMap for each AlertRule, which gets picked up by the Prometheus instance.
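
To illustrate, a sketch of what such an object might look like; this CRD is our own extension (see the PR referenced below), and the API group and field names here are hypothetical:

# Hypothetical AlertRule object for the custom CRD described above
apiVersion: monitoring.example.com/v1   # assumed API group, not the upstream one
kind: AlertRule
metadata:
  name: high-error-rate
spec:
  # Prometheus 1.x rule text that the operator writes into a ConfigMap
  rule: |
    ALERT HighErrorRate
      IF rate(http_requests_failed_total[5m]) > 0.1
      FOR 10m
      ANNOTATIONS { summary = "High error rate detected" }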

see #616

I'm not sure if this fits the vision of the project, but I think it could be a useful addition.

@eedugon (Contributor) commented Sep 14, 2017

Hi @brancz: I can also collaborate on the Helm charts for all the components, trying to make them as flexible and configurable as possible (via values.yaml overrides).

I think one of the most powerful features of Helm is creating different manifests based on the input parameters: for example, adding/removing containers (sidecars) based on the configuration, or setting up resources (memory, specs, services, etc.) based on configuration. I will share some examples I have created soon (e.g. Kafka with jmx_exporter and filebeat sidecars as options, supporting both the 1.6 and 1.7 StatefulSet possibilities, etc.).

I would say the objective should be to be able to install kube-prometheus via Helm charts with the default configuration, but also to be able to prepare your own env-setup.yaml with specific configuration that modifies the behavior/outcome of the final manifests; all the options we provide should be documented.

For example, we could have one setting in the yaml like .Values.grafana.service.enableLoadBalancer. If that setting is set to true, helm would create, during install or upgrade, a Service of type LoadBalancer for external access. If it's set to false, helm won't create the Service at all.
That's just an example, to see if we are aligned.
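
A minimal sketch of how such a toggle could be implemented; the value name comes from the example above, while the template file name, labels, and ports are assumptions:

# values.yaml (hypothetical defaults)
grafana:
  service:
    enableLoadBalancer: false

# templates/grafana-lb-service.yaml (hypothetical sketch)
{{- if .Values.grafana.service.enableLoadBalancer }}
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-grafana-lb
spec:
  type: LoadBalancer
  selector:
    app: grafana        # assumed label on the Grafana pods
  ports:
    - port: 80
      targetPort: 3000  # Grafana's default HTTP port
{{- end }}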

I would say that just getting exactly the same from helm install or kube-prometheys hack/scripts should be only the first step, because we should aim to support as much configurable settings as we could.

@eedugon (Contributor) commented Sep 14, 2017

Another example/idea... :)
Prometheus persistent storage should be a simple option in the yaml for helm, so any user could enable persistence just by turning on a flag and setting the size of the PersistentVolumeClaim to create.
That's very easy to achieve too, and I have it already written.
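
A sketch of what such a values toggle might look like, assuming the chart maps it onto the storage configuration of the operator's Prometheus resource; the key names here are hypothetical:

# values.yaml (hypothetical sketch)
prometheus:
  persistence:
    enabled: true                # when false, the chart renders no volume claim
    size: 50Gi                   # size of the PersistentVolumeClaim to create
    storageClassName: standard   # assumed cluster storage class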

@brancz (Contributor, Author) commented Sep 14, 2017

I think any means of accessing/exposing web UIs should be left completely to the user consuming the chart, so a Service, Ingress, or anything similar does not belong in kube-prometheus, and even less in the prometheus chart. Otherwise every means of access would have to be re-implemented in the charts, which feels like it defeats the purpose.

As I mentioned before, I think we first need to make sure the "leaf" packages are robust. For example, the kube-state-metrics Helm chart needs options to disable certain collectors, which in turn reduces the required RBAC roles, and it should include the addon-resizer. As a kube-state-metrics Helm chart already exists in the upstream stable charts, it would even make sense to contribute this back there; the fewer charts have to be maintained here, the better! Essentially the kube-state-metrics chart should only have the Deployment, the RBAC roles, and a ClusterIP-type Service.

An alternative, and a bit simpler, would be to create a chart for the node-exporter that really only contains the node-exporter and a ClusterIP-type Service. Like the kube-state-metrics chart, this is probably also a candidate for the upstream stable charts. A minimal sketch of such a Service follows.
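
A minimal ClusterIP Service of the kind described; the names, labels, and port are illustrative assumptions:

# Hypothetical minimal Service for a node-exporter chart
apiVersion: v1
kind: Service
metadata:
  name: {{ .Release.Name }}-node-exporter
  labels:
    app: node-exporter
spec:
  type: ClusterIP
  selector:
    app: node-exporter   # assumed label on the node-exporter DaemonSet pods
  ports:
    - name: metrics
      port: 9100         # node-exporter's default port
      targetPort: 9100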

Just to make sure, I want to reiterate that everyone is very welcome to contribute anything discussed here 🙂 . Thanks everyone for wanting to contribute; I'm hopeful we'll get all the charts into good shape!

@tsloughter (Contributor)

Related to this issue and #520, I wanted to mention that being able to add dashboards with helm install and upgrade would be very useful. Not being able to do that is what led me to look at loading dashboards automatically through the watcher or operator. But since dashboards only change when a chart is installed or upgraded, there is really no need for that, except that there is currently no way for a chart installation to update the Grafana dashboards of an existing Grafana instance.

This is likely restricted by the abilities of Helm charts at this time, but it's something I wanted to mention anyway :)

@gianrubio (Contributor) commented Nov 22, 2017

I'm going to work on the reorganisation of the helm charts. The main tasks are:

  • Migrate prometheus to v2.0
  • Point dependencies like kube-state-metrics to the kubernetes/charts repo, and figure out whether more dependencies can be pointed to the k8s charts repo ("the fewer charts have to be maintained here, the better!")
  • Keep kube-prometheus/{rules, grafana, alert} in sync with the kube-prometheus chart
  • Move kube-state-metrics to stable
  • e2e tests for the helm charts
  • Write documentation for helm/kube-prometheus
  • Create a helm chart for metrics-server and add it as a dependency of kube-prometheus

Did I miss something?

@brancz (Contributor, Author) commented Nov 23, 2017

The upstream kube-state-metrics chart needs some more changes, but there is already a stable one. I outlined some things that should be done to it here: helm/charts#2124

@gianrubio (Contributor)

Charts finally got synced with S3; everyone can enjoy Prometheus 2.0 using Helm :)

$ helm repo add coreos https://s3-eu-west-1.amazonaws.com/coreos-charts/stable/
$ helm install coreos/prometheus-operator --name prometheus-operator
$ helm install coreos/kube-prometheus --name kube-prometheus --set rbacEnable=true

@JMLizano

Hi,
I don't know if there is currently any ongoing work on this, but I think it is a great idea. I have started working on a standalone alertmanager chart, and I have come to a point where I am not sure how to proceed. Here are my questions/decisions so far:

  • Offer two installation modes, configurable through a 'deploymentMode' parameter in values.yaml (see the sketch after this comment):
    • Install as a StatefulSet: for when you don't want to use the Prometheus Operator
    • Install through the Prometheus Operator: as the current helm chart in this repo does. This opens a few questions for me:
      • Should this option only be used by umbrella charts (like kube-prometheus), or also on its own?
      • If the latter, how should we manage the 'inherited' parameters, like the prometheus label on the ServiceMonitor or the Prometheus alertmanager rules ConfigMap?

I already have an initial version, and I intend to open a PR against the incubator repo of the official charts after cleaning it up a bit. I will reference this issue there to maintain tracking.
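
A sketch of how such a 'deploymentMode' switch could look in the templates; the mode name comes from the comment above, everything else is a hypothetical outline:

# templates/alertmanager.yaml (hypothetical sketch)
{{- if eq .Values.deploymentMode "prometheusOperator" }}
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: {{ .Release.Name }}-alertmanager
spec:
  replicas: {{ .Values.replicas }}
{{- else }}
# Render a plain StatefulSet (plus its ConfigMap and Service) here
# when the Prometheus Operator is not used.
{{- end }}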

@vsliouniaev (Contributor)

This is merged now, and it uses the grafana sidecars.
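
For context, the sidecar approach picks up dashboards from labeled ConfigMaps, roughly like this; the label key is configurable per chart, and the one shown here is an assumed default:

# Hypothetical dashboard ConfigMap picked up by the Grafana sidecar
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard
  labels:
    grafana_dashboard: "1"   # label the sidecar watches for (assumed key)
data:
  my-dashboard.json: |
    { "title": "My Dashboard", "panels": [] }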

@mvisonneau

I can see that prometheus-operator has been merged, but I cannot find an equivalent of kube-prometheus. Is that expected? 🤔

@metalmatze (Member)

The prometheus-operator chart is actually more than just the operator, and there is some discussion going on about whether to rename the chart to kube-prometheus right away. I might still be missing parts though, as I wasn't involved in the process.

@mvisonneau

Indeed, after looking into it I figured it out! Thanks.

@shuraa commented Nov 7, 2018

@metalmatze how can I upgrade coreos/prometheus-operator (0.0.29) to stable/prometheus-operator (0.1.15)?

helm ls

NAME               	REVISION	UPDATED                 	STATUS  	CHART                     	APP VERSION	NAMESPACE
kube-prometheus    	1       	Thu Oct 18 09:07:19 2018	DEPLOYED	kube-prometheus-0.0.105   	           	monitoring
prometheus-operator	1       	Fri Sep 21 22:19:24 2018	DEPLOYED	prometheus-operator-0.0.29	0.20.0     	monitoring

@vsliouniaev (Contributor)

@shuraa there is no direct upgrade path; the chart in stable/prometheus-operator has a different structure. You may be able to do some upgrades, depending on your circumstances (a values sketch follows this list):

  • The stable/prometheus-operator chart includes both the coreos/kube-prometheus and coreos/prometheus-operator charts, and all their dependencies too.
  • The cleanest way is to start from scratch: delete both releases and install stable/prometheus-operator.
  • If you want to keep some components:
    • Don't delete any of the CRDs, and set prometheusOperator.createCustomResource: false in the new chart; prometheus-operator should handle upgrades there itself when the new pod comes up.
    • If you are using persistent storage, you can specify the existing storage and attach it to the new Prometheus instances.
    • If you are creating custom ServiceMonitors as part of the charts, you can carry those over under prometheus.prometheusSpec.additionalServiceMonitors.
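
A sketch of migration values along the lines described above; the two keys come from this list, while the comments mark the spots that depend on your setup:

# values-migration.yaml (hypothetical sketch)
prometheusOperator:
  createCustomResource: false       # keep the existing CRDs in place
prometheus:
  prometheusSpec:
    additionalServiceMonitors: []   # carry over custom ServiceMonitors here
    # add your existing persistent storage configuration here if reusing it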

@shuraa commented Nov 8, 2018

@vsliouniaev, Vasily, thank you! I'll try to test your instructions in a Vagrant environment.

@lcostea commented Nov 22, 2018

@vsliouniaev What about putting your answer into the README for stable/prometheus-operator?

There is still some confusion between prometheus-operator and kube-prometheus, like what you should install on the cluster and why kube-prometheus was not ported to stable/charts. We faced these issues when we started a few weeks ago, and it wasn't easy to figure out the answers. So this could be really helpful.

@brancz (Contributor, Author) commented Nov 22, 2018

Additionally, do you think it would be possible to have each of the components separated in the upstream charts, and to actually call the collection "kube-prometheus"? It's blurring the lines even further between what makes up kube-prometheus and what is just the prometheus-operator.

@larsduelfer

> @vsliouniaev What about putting your answer into the README for stable/prometheus-operator?
>
> There is still some confusion between prometheus-operator and kube-prometheus, like what you should install on the cluster and why kube-prometheus was not ported to stable/charts. We faced these issues when we started a few weeks ago, and it wasn't easy to figure out the answers. So this could be really helpful.

As far as I understand, the prometheus-operator chart available in the helm/stable repo contains the prometheus-operator content as well as the kube-prometheus content from the coreos repo.

I raised this question here yesterday, because we had lots of confusion with the current duplication as well: #2153

@chrisob commented Nov 22, 2018

+1 for having some sort of documentation on an upgrade path/migration steps.

I'm in the middle of migrating right now, and other than this GH issue, there's not much documentation on what the current state is. So much has changed (alert rules, Grafana dashboards, removal of kube-prometheus, etc.) that the chart at the new location is essentially an entirely new Helm chart, and I've had to drop and recreate our entire monitoring stack :(

Don't get me wrong, I think the new chart is structured better than the old one, but migration isn't a simple helm upgrade, or a matter of changing a few Helm values and then upgrading.

@richerve (Contributor) commented Dec 4, 2018

In my opinion, kube-prometheus should probably never exist upstream, because it is basically a meta chart that pulls in all the other dependencies, with just a few minor monitoring resources included. So it could exist in a repo like this one as a "recommended" setup. As @metalmatze commented, it's confusing because the upstream prometheus-operator chart is now sort of what kube-prometheus was, which in my opinion is a mistake.

Frankly, this is one of the reasons why Helm charts are just a pain to maintain and too opinionated about what to include and what to expose in values.yaml; basically, right now people are stuck rebuilding their whole Prometheus stack from scratch because of this.

@brancz (Contributor, Author) commented Dec 4, 2018

Agreed on all points, @richerve (if you do end up "building the entire stack from scratch", do have a look at the jsonnet-based alternative 🙂).

@maver1ck

I don't understand why the migration process renamed half of the variables and made the changes so incompatible.
Doing a migration is a pain in the ass.
I have failed twice so far; every time I find a new blocker.

@robbie-demuth

Are there plans to once again provide separate charts for the main subcomponents of prometheus-operator, such as prometheus and alertmanager? It used to be useful to be able to declare dependencies on these charts to easily incorporate a Prometheus or Alertmanager instance into another chart. This of course required prometheus-operator to already be installed in the cluster so that the CRDs and such were available. What's the recommended way to pull in these templates now? Declaring a dependency on prometheus-operator seems too heavy, since in the use case I'm trying to address it is already deployed separately in the cluster. That means most of the values exposed in the new chart would have to be toggled off. It seems to me that it would be much cleaner to just declare a dependency on separate prometheus and alertmanager charts. Does that make sense?

@vsliouniaev (Contributor)

We're using the single chart and toggling components on and off; a sketch of the toggles is below. Doing what you suggest would require another quite significant migration, IMO.
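
For reference, toggling components on and off in the single chart looks roughly like this; the exact flag names depend on the chart version, so treat these as assumptions:

# values.yaml (sketch; flag names may differ between chart versions)
prometheusOperator:
  enabled: false    # operator already deployed separately in the cluster
grafana:
  enabled: false
nodeExporter:
  enabled: false
kubeStateMetrics:
  enabled: false
alertmanager:
  enabled: true     # keep only the pieces you need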

stale bot commented Aug 14, 2019

This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.

stale bot commented Oct 20, 2019

This issue has been automatically marked as stale because it has not had any activity in last 60d. Thank you for your contributions.

stale bot added the stale label Oct 20, 2019

@metalmatze (Member)

All further discussion should happen in other, more specific issues.
