-
-
Notifications
You must be signed in to change notification settings - Fork 750
Long running occupancy #5395
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Long running occupancy #5395
Conversation
e1c1ee9 to
c15051c
Compare
486719f to
ab517ea
Compare
gjoseph92
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, though I'm a little curious why the task needs to stay in processing on the WorkerState
|
|
||
| @ccall | ||
| @exceptval(check=False) | ||
| def _reevaluate_occupancy_worker(self, ws: WorkerState): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What was the reason for moving this to be a method instead of a standalone function? I agree methods are more readable, just not sure about the Cython consequences cc @jakirkham
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both will work. However there was extra overhead from the method call vs. the function call, which is why it was moved to a function. Given this gets called a lot, that overhead mattered in our benchmarks.
ab517ea to
a151a4b
Compare
| self.scheduler._reevaluate_occupancy_worker(thief) | ||
| self.scheduler._reevaluate_occupancy_worker(victim) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it looks like _reevaluate_occupancy_worker was removed (even as a method), should we be doing something else here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, I missed this one. I think this is why I put it in as a mathod to avoid cyclic imports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW did like the idea you had of getting rid of this function if that is an option. Though admittedly there might need to be more thought on this test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intuitively, it feels like this should not even be here since this interacts pretty deeply with the scheduler and the stealing thing should be an extension and not interact on this deep level. However, the only way to not do this here would be to solely rely on the eventual update via a callback. No idea how disruptive this would be
0eda3c9 to
a151a4b
Compare
We're removing long running tasks from a workers occupancy. However, occasionally we're recalculating the worker occupancy to account for new information since the occupancy gets inaccurate over time. However, this reevaluation does not take into account
long-runningtasks since the scheduler does not further break the stateprocessingdown. Similar for the substateexecuting, we're tracking this on a per-worker base and we should do the same for thelong-runningsubstateThis builds on top of #5392 since I pulled together the occupancy calculation over there.
Fixes of this PR in last commit 9c20b32
pre-commit run --all-files