-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Prototype for local Ratekeeper #1229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There currently exists an alternate mechanism to limit reads to particular storage servers when the queues get large. I think in the replies to various requests, it signals to the client some penalty and then load balancing responds by preferring other storage servers. For the case of local controlling, it may be sufficient for that mechanism to be used instead rather than adding a new one. It still may make sense for ratekeeper to control globally when multiple team members are experiencing sufficient lag. |
ea2e9cf to
7699d44
Compare
This is pretty cool, I wasn't aware of this mechanism. We are currently playing with several mechanisms to figure out what works best. So we'll try something which uses this later. |
fdbserver/storageserver.actor.cpp
Outdated
| Arena lastArena; | ||
| double cpuUsage; | ||
| double diskUsage; | ||
| double currentRate = 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the name currentRate, I would expect this to be an actual rate and not the probability of a request being accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth, I don't find it super straightforward from that wikipedia article what your response actually means. I took the initial comment as saying that a different name could more clearly describe what the value was since it doesn't seem to be a straightforward rate. Perhaps the name is defensible, though it's not clear to me why.
|
@fdb-build, ok to test |
|
I don't think this should be tested yet. This is very much in the prototyping phase and is known to not work properly |
f73d85f to
e08fc58
Compare
|
|
||
| double getPenalty() { | ||
| return std::max(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER - 2.0*SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER)) / SERVER_KNOBS->SPRING_BYTES_STORAGE_SERVER); | ||
| return std::max(std::max(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be
| return std::max(std::max(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER - | |
| return std::max(std::min(1.0, (queueSize() - (SERVER_KNOBS->TARGET_BYTES_PER_STORAGE_SERVER - |
ff47192 to
91df1bd
Compare
Co-Authored-By: mpilman <[email protected]>
b2abb40 to
b286102
Compare
|
This work is continued in #1477 - so closing this one |
The intention here is to let storage servers throttle incoming reads if
durability lag goes up.
Ratekeeper will pick this up and go down if a whole team sees high NDV.
This is not intended for merging yet, but only for code review and testing.