Adding to Grafana dashboards #520
We don't have a channel dedicated to the Prometheus Operator, however, lots of people use the

As I don't use the grafana helm chart I cannot comment reliably on its functionality. As far as I understand, the grafana chart should be independent of the dashboards it is deployed with, and only when deploying the

@weiwei04, being the maintainer of this chart, can probably answer your questions more reliably.
@brancz thanks! Sure, the issue is, at least for me, wanting to have unmodified kube-prometheus and grafana charts that get installed, and then when I install another app through a chart that needs specific dashboards, they have to be manually added each time. Or grafana has to be a dep of each chart, and then include its dashboards, and then the user switches grafanas to see the different dashboards.

A separate issue is it seems
Here's my current thought: in our user scenario, there are two kinds of users:

1. The k8s administrator, who only cares about k8s system metrics and some addon metrics (ingress, storage-class provisioner, etc.).
2. The k8s user, who cares about their containers' resource metrics and some other business metrics.

The need for dashboarding may be divided into a), b), c), d). For a), b), c), maybe a grafana-operator (defining Kind: Grafana and Kind: GrafanaDashboard) could work, with app charts including a

For d), I think users may create their dashboards from scratch.

For now, the manual solution for reusing a grafana instance to include more dashboards is to use the grafana web UI import.
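To make the grafana-operator idea above concrete, a `Kind: GrafanaDashboard` resource could look roughly like this. This is purely an illustrative sketch: the two `Kind`s come from the comment, but every field name, the API group, and the overall schema are assumptions, not an implemented API.

```yaml
# Hypothetical GrafanaDashboard custom resource (illustrative only; the
# apiVersion, labels, and spec fields are assumptions):
apiVersion: monitoring.example.com/v1alpha1
kind: GrafanaDashboard
metadata:
  name: myapp-dashboard
  labels:
    app: myapp
spec:
  # Raw dashboard JSON the operator would provision into Grafana
  json: |-
    {
      "title": "MyApp Overview",
      "rows": []
    }
```

An app chart could then ship such a resource alongside its Deployment, and the operator would reconcile it into whichever Grafana instance matches.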
There was already some thought on adding Grafana support to the Prometheus Operator, see #115. Our general idea around using Grafana is that it needs to be completely stateless. Therefore any solution involving web UI import is highly discouraged (the same applies to the current Grafana + grafana-watcher usage). It is important to understand that when a Grafana Pod dies, it will not have the same database, but it will again be provisioned from the

The idea is that new dashboards are either created through a meta language (still to be implemented, but possibly weaveworks/grafanalib), or via the web UI and, immediately after completion, exported (and deleted) via the web UI and then stored in some form of version control, from where the dashboards will then be consistently deployed to the ConfigMap, and then to the Grafana instances through the established workflow with the grafana-watcher.

In that sense, the Grafana chart itself needs to be completely bare-bones and not make any assumptions, other than that it will get datasources and dashboards which are to be provisioned. We should not always include the pre-built kube-prometheus dashboards; those should be values chosen by a user when deploying that chart.

To recap: the grafana-watcher was implemented so that Grafana is stateless, therefore importing through the web UI is highly discouraged.
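As a concrete illustration of this stateless workflow, a version-controlled dashboard would end up in a ConfigMap along these lines (the ConfigMap name, file name, and dashboard contents are all illustrative assumptions), which the grafana-watcher then pushes into each Grafana instance:

```yaml
# Sketch of a dashboard ConfigMap as consumed by grafana-watcher
# (names and contents are illustrative, not taken from kube-prometheus):
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards
data:
  kubernetes-cluster-health.json: |-
    {
      "dashboard": { "title": "Kubernetes Cluster Health" }
    }
```

Because the dashboard lives in the ConfigMap rather than in Grafana's database, a replacement Pod is re-provisioned with identical content.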
Thanks for the reply, my mistake. We use a PVC (backed by Ceph RBD) to store grafana state; if the Pod dies, the same PV is attached to the new grafana Pod and all the state is recovered. You're right that importing through the web UI does not work for everyone and should be highly discouraged. Making grafana completely stateless, by using ConfigMaps to persist the dashboards and grafana-watcher to provision grafana, suits more people. I'll keep this in mind :)

Back to the question:
I should move the pre-installed serverDashboardsFiles into the kube-prometheus values.yaml and keep the grafana chart clean.
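A hedged sketch of what that move could look like: only the `serverDashboardsFiles` key comes from this thread; the nesting under `grafana:` and the example dashboard content are assumptions, not verified against the actual chart.

```yaml
# Hypothetical kube-prometheus values.yaml fragment (structure assumed):
grafana:
  serverDashboardsFiles:
    k8s-cluster-health.json: |-
      { "title": "Kubernetes Cluster Health", "rows": [] }
```

The grafana chart itself would then ship with this map empty, and kube-prometheus (or any user) would supply the dashboards as values.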
That sounds perfect. That way we can ship sensible defaults with the kube-prometheus chart(s), but still leave them bare-bones. Essentially, as I view kube-prometheus, the only things the "package" should include are the things that make the overlaying dashboards/rules work, e.g. the Kubernetes services, and plugging together all the components like Prometheus and Alertmanager. These dashboards and rule files should purely be values configured by each user, but we can ship sensible yet extensible example defaults.

@weiwei04, from the issues people are having it seems that the kube-prometheus charts need some work; if you are interested in cleaning those up, I'm more than happy to make progress on that end with your help. Thanks for all your help @weiwei04! And it's always good to get a healthy discussion started @tsloughter, thanks for kicking off this one!
Basically what I'm saying is: if you two could help review the helm chart PRs and issues, we'd highly appreciate it, as we're not using helm ourselves but would still love to ship high quality packages. We're just lacking contributors on the helm end.

Let me know how I can help.
Yes, I'd like to help maintain the kube-prometheus charts (PRs and issues are welcome), but I don't have a publicly available helm registry to publish these charts to, since the charts in kube-prometheus are in @mgoodness's registry.
@brancz yea, I'm looking for a stateless solution as well. Happy to help any way I can (I still need to read through this whole thread). I'm currently working on a blog post for our company website that details using these helm charts, and grafana dashboards were one of the last pieces to figure out.

The "simplest" solution (not sure how simple it would actually be) I can think of is to either build a single ConfigMap by selecting other ConfigMaps that declare dashboards, which is then mounted into the pod; or build a list of ConfigMaps to mount (assuming the grafana program can accept multiple directories of dashboards?) and modify the grafana deployment to mount them and add them to the list of directories passed to the
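The second idea, mounting a list of ConfigMaps as separate dashboard directories, could look roughly like this Pod template fragment. All names and mount paths here are illustrative assumptions, not the actual chart's layout:

```yaml
# Sketch (names assumed) of mounting several dashboard ConfigMaps into
# the grafana Pod as separate directories:
spec:
  containers:
    - name: grafana
      volumeMounts:
        - name: core-dashboards
          mountPath: /var/grafana-dashboards/core
        - name: myapp-dashboards
          mountPath: /var/grafana-dashboards/myapp
  volumes:
    - name: core-dashboards
      configMap:
        name: grafana-core-dashboards
    - name: myapp-dashboards
      configMap:
        name: grafana-myapp-dashboards
```

The open question remains whether the watcher/provisioning side can be pointed at more than one directory at once.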
That's one thing where having a controller/operator would be useful 🙂. It's essentially the same thing as we're doing with rule files in the Prometheus object.
I'm very interested in this topic (more in the overall grafana solution you are discussing than the helm charts, but both are important). Regarding the grafana dashboards, as a short term solution (to allow users to add/remove dashboards easily), I would propose a bash script/tool that puts all dashboards into the ConfigMap yaml definition; updating the ConfigMap with kubectl would then apply the changes. The tool could also allow "adding" or "removing" dashboards from the ConfigMap definition, or even dumping the ConfigMap definition from a running system in case the dashboard files are lost. But as @weiwei04 has mentioned, if there's a size limit on ConfigMaps we might reach it. I think I will have to build the tool anyway to avoid using "make generate" for that activity... unless you know a better procedure at the moment.
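A minimal sketch of the proposed bash tool: bundle every dashboard JSON file in a directory into one ConfigMap manifest. The function and ConfigMap names are illustrative placeholders, not anything that exists in the repo.

```shell
# Hypothetical helper along the lines proposed above (all names are
# illustrative): bundle every *.json dashboard in a directory into a
# single Kubernetes ConfigMap manifest, printed to stdout.
bundle_dashboards() {
  dir="$1"
  name="$2"
  printf 'apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: %s\ndata:\n' "$name"
  for f in "$dir"/*.json; do
    # each file becomes a data key; the JSON body is indented under it
    printf '  %s: |-\n' "$(basename "$f")"
    sed 's/^/    /' "$f"
  done
}
```

Piping the output to `kubectl apply -f -` would update the ConfigMap in place, subject to the size limit mentioned above.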
Yea, an operator/controller would be best since it would work no matter how grafana is installed into kubernetes. It shouldn't be a very complicated task for grafana (compared to prometheus/alertmanager/etcd), right? If no one else has started on it I will take a whack at it. Any pointers on whether it should just be a custom controller or an operator, and links to any resources that might be helpful, are much appreciated :)
@tsloughter we've thought about actually integrating that directly into the Prometheus Operator (which would make the name slightly odd, so it might actually be best to develop it outside). The great thing about having it in the Prometheus Operator would be that we can make a lot more assumptions about its usage, but given we're all on this thread I think we agree on the datasource type 🙂.

The size limit is exactly what we hit with Prometheus rule files as well, which is why we made it a label selector; I'd imagine doing the exact same thing here. Other than that I agree @tsloughter, running grafana, especially in a stateless way, is much much easier than Prometheus/Alertmanager/etcd, as we by definition don't have to worry about state; it basically just templates the Deployment with the correct ConfigMaps.

Let me know when you've thrown something together @tsloughter! Also I'd be happy to discuss design decisions, as we've already thought about this a lot.

@eedugon regarding:
This sounds like it could be an additional script in the
To install, it's:
Opened PR #558 to delete the pre-installed dashboards from kube-prometheus. And thanks @ant31 for helping to set up a registry; the grafana chart can now be installed with
But the quay.io registry uses a private spec, different from the helm http registry spec, so for now I haven't found a proper way to include grafana into kube-prometheus. I will try to think of other ways.
I was out for a week but am now looking at this again :). I noticed some PRs/issues since then that look related to grafana dashboards and will look through those first, but let me know if anyone knows of ones that definitely relate.
Cool. I'll look at those. But after looking at it, I don't see myself getting around to working on an operator for this anytime soon, so hopefully someone else is interested :). Also, I wrote a short blog post about how we are using the operator and helm charts and installing dashboards for now: https://spacetimeinsight.com/installing-monitoring-erlang-releases-kubernetes-helm-prometheus/ If anyone reads it and sees something wrong with how I'm doing the operator configuration, please let me know :)
Is there a reason the

If the watcher watched for new ConfigMaps (or changes to existing ones), checked whether they were grafana dashboards, and then imported them through the grafana API directly from the ConfigMap, this would solve the issue of currently having only one source for dashboards.
So I did a quick thing: https://github.com/tsloughter/grafana-operator

This is what I was thinking in terms of loading the dashboards from

Not sure if it makes sense as a sidecar or as part of
The reason the grafana-watcher is a sidecar is that this way it only needs to worry about consistency between the files on disk and the single "local" grafana instance. Whereas if you were to build this into an operator, then a) that operator needs to perform all of those requests against all grafana servers, which is a NetworkPolicy I wouldn't want in my cluster, and b) operators are scoped to managing Kubernetes objects rather than interacting directly with the software.

A grafana Operator would be neat anyway, because we could handle higher level concerns, like how to handle multiple ConfigMaps being mounted, sane deployment strategies, etc. (Whether it would belong in the Prometheus Operator is an open question.)
Ok, gotcha. I'm going to continue working on the grafana operator when I have time. Hopefully others who are interested in Grafana being handled this way will see this issue and help out :). Going to add a list of what needs to be done for it to be actually usable to the readme.
With the recently added scripts by @eedugon, I'm a bit skeptical whether a full blown Grafana Operator might be overdoing it; the scripts automatically split up the Grafana dashboard JSON definitions into multiple

With all this tooling in place, all one needs to do is export a Grafana dashboard through the UI, drop it into the grafana assets folder, and run

I feel we're at a pretty good point in terms of tooling around this, so I'll close this issue at this point, but feel free to re-open or continue the discussion if you feel there is a need 🙂.
Ok. I disagree, but understand :). I'd want dashboards to work after a
We figured that would also be possible by tweaking the grafana-watcher slightly, similar to how the prometheus-config-reloader works.
@brancz Was there further work/thought on slightly tweaking the grafana-watcher to allow for annotations to add dashboards? Perhaps a PR that my searching isn't turning up? I would really like to use this in conjunction with a helm-deployed grafana(-watcher) and a dashboard deployed by an app.

I understand the network policy concern for a singleton deploy of grafana; however, my current setup leans more towards allowing app devs to modify their own dashboards according to their needs, while the platform devs (myself included) worry about managing the rest of the stuff to make that work, i.e. network/RBAC policies, separate (or not) grafanas and prometheuses.

Again, this is likely not a concern of the prometheus operator, but I am looking for a place where further work/discussion might have taken place.
@japaniel modifying existing dashboards is not a problem, annotations included. The process may just be a bit different from what you're doing today. The way I recommend doing it is to modify the dashboard, either via the UI and exporting the result, or in the source directly (we're going to be working on some neat things here, so stay tuned). Then you open a pull request against your infra repo, or wherever the dashboard is versioned. Once merged, it's deployed from disk like any other dashboard. Where the experience lacks a bit today is the modify-and-export flow in the UI, as well as the text-editing one, but this is going to get better 🙂.
First, is there a chat where operators are discussed? Questions like this I would first bring there if there were a gitter/slack/irc channel for discussing operators. I try to bring the helm-specific questions to the helm-users channel on slack first, but often can't find answers.
So I'm trying to figure out a good way of adding dashboards to a grafana deployed through the prometheus-operator's grafana helm chart. When I had been deploying prometheus-operator through the kube-prometheus scripts, I would have my application's helm chart write over the grafana ConfigMap with a new one. However, this is not possible when grafana was deployed with a separate helm chart. I wanted to know if anyone had solved this issue: not necessarily a full solution for dynamically adding dashboards for resources, but at least being able to recreate/rewrite what grafana's dashboard volume is made of.
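For the manual recreate/rewrite case, one possible approach (an untested sketch; the ConfigMap name and directory are placeholders, and this assumes the chart's dashboard volume is backed by that ConfigMap) is to regenerate the ConfigMap from local dashboard files and apply it over the existing one:

```shell
# Rebuild the dashboards ConfigMap from the JSON files in ./dashboards/
# and overwrite the existing ConfigMap of the same name (names assumed):
kubectl create configmap grafana-dashboards \
  --from-file=dashboards/ \
  --dry-run -o yaml | kubectl apply -f -
```

The grafana-watcher sidecar would then pick up the changed volume contents and re-provision the dashboards.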