-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Network toplogy aware scheduling support in Volcano scheduler: Populate hyperNode tree in cache #3881 #3922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Welcome @penggu! |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
pkg/scheduler/cache/cache.go
Outdated
| hnfb.nodes[hn.Name] = hntn | ||
|
|
||
| tierStr := hn.Spec.Tier | ||
| tier, err := strconv.Atoi(tierStr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, in api definition, the tier is a type of string, what if the user wants to specify tier name other than numbers? Is there any purpose in defining the type of tier as string before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Tier should be an int. If a "tier name" is needed, then a mapping from a "tier name" string to its "tier number" int is needed.
pkg/scheduler/cache/cache.go
Outdated
| nodeName := member.Selector.ExactMatch.Name | ||
| hntn.NodeMatchers.Insert(NewExactMatcher(nodeName)) | ||
| workerNode := sc.Nodes[nodeName] | ||
| if workerNode == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a scenario that if the HyperNode is created first and then handles the event of the HyperNode, the Node has not been created yet and is not in the cache. If we directly return here and there is no change of the HyperNode, there will no longer be new events related to the HyperNode. You can look at the logic in vc-controller. The event of resource change and the worker responsible for processing collaborate through the workqueue. If the event processing fails, the worker can throw it back to the workqueue, and then process it from the workqueue later. So we may need to add worker and workqueue here, and re-enqueue the event if it fails
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for bring up this scenario. I am aware of it. My hope is that the event handler for Node need to update the HyperNode (I did not implement that logic in the current PR. I was planning to add it later).
Your proposed solution (using workqueue) should work for Exact match. I wonder how does it work for Regex match. In case of Regex match, when will we know for sure that no future child node will appear -- hence we stop throwing it back to the workqueue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better solution is to update HyperNodeTree when AddNode and DeleteNode, so that it doesn't matter if Node is not created yet, both ExactMatch and RegexMatch can be included. Our original intention of designing RegexMatch is actually to serve Node.
pkg/scheduler/cache/cache.go
Outdated
| case networktopov1alpha1.ExactMatchMemberSelectorType: | ||
| memberName := member.Selector.ExactMatch.Name | ||
| hntn.MemberMatchers.Insert(NewExactMatcher(memberName)) | ||
| member := sc.hypernodeForestbuilder.nodes[memberName] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same problem, the HyperNode events here are out of order, and the Member may not have been recorded in the builder. Then if the HyperNode does not change, there won't be more events coming
pkg/scheduler/cache/cache.go
Outdated
| if member != nil { | ||
| insertHyperNodeMember(hntn, hnfb, memberName, member, sc) | ||
| if hntn.Parent != nil { | ||
| sc.HyperNodes[hntn.Parent.Name].Union(sc.HyperNodes[memberName]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also has the same problem here, Union is cloned and the return value is not processed, and I didn't get the point here, what is the purpose here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also has the same problem here, Union is cloned and the return value is not processed, and I didn't get the point here, what is the purpose here?
All members of a HyperNode should also be members of its parent
pkg/scheduler/cache/cache.go
Outdated
| } | ||
| } | ||
| if hntn.Parent != nil { | ||
| sc.HyperNodes[hntn.Parent.Name].Union(sc.HyperNodes[hntn.Name]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also has the same problem here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also has the same problem here
Same answer: All members of a HyperNode should also be members of its parent
pkg/scheduler/cache/cache.go
Outdated
| if hnfb.roots.Has(hn.Name) { | ||
| hnfb.roots.Delete(hn.Name) | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed update sc.HyperNodeListByTier and sc.HyperNodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
pkg/scheduler/cache/cache.go
Outdated
| CSINodesStatus map[string]*schedulingapi.CSINodeStatusInfo | ||
| HyperNodeListByTier map[int]sets.Set[string] // Maps tier to a set of hypernode names | ||
| HyperNodes map[string]sets.Set[string] // Maps hypernode name to a set of node names | ||
| HyperNodeTreeRoots HyperNodeSet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that HyperNodeTreeRoots is not used. Is it still needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. It has been replaced by hypernodeForestbuilder.roots
|
|
||
| // UpdateHyperNodeV1alpha1 update hypernode in scheduler cache | ||
| func (sc *SchedulerCache) UpdateHyperNodeV1alpha1(oldObj, newObj interface{}) { | ||
| sc.DeleteHyperNodeV1alpha1(oldObj) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we directly implement the logic of update member and update hypernode here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Direct implementation involves
- Finding diff between the old and new CRD
- Update the forest based on the updated fields, and all the parent/child relationship based on ExactMatch and RegexMatch needs to be updated.
Because of the complicated logic involved in Step2, especially RegexMatch for both its children and its parents, that piece of code will be very complicated, and it will be a lot of duplicated logic from exiting AddHyperNodeV1alpha1() and DeleteHyperNodeV1alpha1().
i think the current implementation is easier to understand and to maintain. Please let me know if you think otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's great
|
Please rebase onto the new codes: #3874, and squash your commit into one, so I can review clearly |
| } | ||
|
|
||
| // insertHyperNodeMember inserts a member into a hypernode and updates the forest | ||
| func insertHyperNodeMember(hntn *HyperNodeTreeNode, hnfb *HyperNodeForestBuilder, memberName string, memberNode *HyperNodeTreeNode, sc *SchedulerCache) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no distinction between the nodes in the leaf nodes and the hyperNodes of non-leaf nodes. What we expect is that the key stored in HyperNodes is the hyperNode, and the value is a list of all leaf nodes under the hyperNode.
pkg/scheduler/cache/cache.go
Outdated
| case networktopov1alpha1.ExactMatchMemberSelectorType: | ||
| nodeName := member.Selector.ExactMatch.Name | ||
| hntn.NodeMatchers.Insert(NewExactMatcher(nodeName)) | ||
| workerNode := sc.Nodes[nodeName] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sc.Nodes should be accessed with lock.
| Parent *HyperNodeTreeNode | ||
| Members sets.Set[*HyperNodeTreeNode] | ||
| MemberMatchers sets.Set[*MemberMatcher] | ||
| Nodes sets.Set[*schedulingapi.NodeInfo] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We just need nodeName instead of NodeInfo.
| HyperNodeListByTier map[int]sets.Set[string] // Maps tier to a set of hypernode names | ||
| HyperNodes map[string]sets.Set[string] // Maps hypernode name to a set of node names | ||
| HyperNodeTreeRoots HyperNodeSet | ||
| hypernodeForestbuilder *HyperNodeForestBuilder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also be locked.
|
We should also expose a method like LCA(Lowest Common Ancestor) for HyperNodeTreeRoots, which will be used by score logic. |
pkg/scheduler/cache/cache.go
Outdated
| klog.V(4).Infof("Adding hypernode %s to the forest, in %d tier", hn.Name, tier) | ||
|
|
||
| // Iterate over members and add them to the forest | ||
| for _, member := range hn.Spec.Members { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be wrapped as a func, it' the logic to build a hyperNodeTreeNode, and the last thing to do is update the sc.HyperNodeListByTier and sc.HyperNodes.
And can we just use one var to express the tree?
| Parent *HyperNodeTreeNode | ||
| Members sets.Set[*HyperNodeTreeNode] | ||
| MemberMatchers sets.Set[*MemberMatcher] | ||
| Nodes sets.Set[*schedulingapi.NodeInfo] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And seems nowhere to assign leaf nodes to sc.HyperNodes.
pkg/scheduler/cache/cache.go
Outdated
| for _, parentNode := range sc.hypernodeForestbuilder.nodes { | ||
| for m := range parentNode.MemberMatchers { | ||
| if m.Exact != "" && m.Exact == hn.Name { | ||
| insertHyperNodeMember(parentNode, hnfb, hn.Name, hntn, sc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insertHyperNodeMember should be used as a method of *HyperNodeForestBuilder instead of passing it in through parameters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SchedulerCache should also be used as an attribute of HyperNodeForestBuilder. The function parameters must be consistent with the function name and responsibilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't memerNode already contain a name attribute? No need to pass again
aec453f to
5e3e22e
Compare
| }) | ||
| } | ||
|
|
||
| // TODO (penggu): create informat for HyperNode(v1alpha1) information |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO still need?
| Tier int | ||
| Parent *HyperNodeTreeNode | ||
| Members sets.Set[*HyperNodeTreeNode] | ||
| MemberMatchers sets.Set[*MemberMatcher] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we group together Members/NodeMatchers and MemberMatchers/NodeMatchers? Too many unused fields
Add AddFunc for HyperNode informer support to SchedulerCache Network topology implements of scheduler Signed-off-by: Monokaix <[email protected]> Add plugin for networkTopology and score logic Signed-off-by: Monokaix <[email protected]> WIP1 HyperNode handling in scheduler cache Change HyperNodesListByTier to a map of set (instead of string) Refactor event handling for HyperNode cache Remove unused updateHyperNodeListByTier function Add child member handling for HyperNodes based on parent matchers Move add and delete HyperNode functions to event_handlers.go Refactor HyperNode handling: unify member updates
d453afd to
4d68f2c
Compare
|
@penggu: PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
This pr has some code conflict to resolve and update, and #3964 will work based on this: ) |
|
This feature has been merged via PR #3964. So it is no longer needed. OK to close this PR now. |
add event handler for adding/deleting/updating HyperNode in scheduler cache.