-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[pinot docker image] Upgrade prometheus and scope rulesets to components #8227
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
walterddr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great. btw we shouldn't check in a binary jar to repo in the first place :-)
you're telling me? |
Codecov Report
@@ Coverage Diff @@
## master #8227 +/- ##
============================================
- Coverage 71.06% 71.00% -0.07%
- Complexity 4320 4322 +2
============================================
Files 1626 1626
Lines 85036 85076 +40
Branches 12795 12802 +7
============================================
- Hits 60428 60405 -23
- Misses 20456 20518 +62
- Partials 4152 4153 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
|
Adding the cache expressions needs thorough testing as it is not appropriate to all metric matching logic, I suggest delaying caching |
|
Makes a lot of sense! Thanks Richard! |
This is my bad :p |
|
I figured out that the much bigger problem here is that all the rules are evaluated on all the instance types, so we are running controller rules on the brokers, controller + broker on the servers, controller + broker + server rules on the minions before matching. We should break this down before considering caching. |
20b2b40 to
fd1de54
Compare
docker/images/pinot/etc/jmx_prometheus_javaagent/configs/broker.yml
Outdated
Show resolved
Hide resolved
f07119e to
55ce55f
Compare
…nts (apache#8227) * upgrade prometheus and enable caching * revert cahcing for now * download prometheus 0.12.0 for backward compatibility * separate config files to reduce scope of rules to appropriate services * newlines
We frequently see that prometheus is a resource hog in our customer deployments, with regular expression matching leading to excessive allocation and CPU time. Pinot currently uses version 0.12.0, which is hardcoded into deployments and the jar is checked in.

Prometheus fixed this here, allowing : prometheus/jmx_exporter#518
This PR changes this so that the latest version (pinned at 0.16.1) of jmx_prometheus_javaagent is downloaded in the docker image, and caching is enabled for our rules.