feat: set IgnoreDaemonSetsUtilization per nodegroup for AWS #5672
Conversation
@drmorr0, @feiskyer
There are many cloud providers which implement `GetOptions` — see `grep -R ") GetOptions" ./cloudprovider -A 2 --exclude-dir=vendor`.
```diff
-type utilizationThresholdGetter interface {
+type nodeGroupConfigGetter interface {
```
```go
	return simulator.UnexpectedError, nil
}

gpuConfig := context.CloudProvider.GetNodeGpuConfig(node)
```
The code from lines 141 to 146 here is just a copy of lines 121 to 126. I have moved it down because I want to use the nodegroup when calling `GetIgnoreDaemonSetsUtilization` to get the per-ASG value of `IgnoreDaemonSetsUtilization`, and then pass it to `utilization.Calculate` on line 142 below.
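The flow described above — resolve the per-nodegroup setting first, then feed it into the utilization calculation — can be sketched as follows. All names here (`NodeGroup`, `Calculate`, the CPU numbers) are simplified stand-ins for illustration, not the real autoscaler API:

```go
package main

import "fmt"

// NodeGroup is a hypothetical, simplified stand-in carrying the
// per-nodegroup override described in the review comment.
type NodeGroup struct {
	Name                        string
	IgnoreDaemonSetsUtilization bool
}

// Calculate mimics the idea behind utilization.Calculate: daemonset
// requests are excluded from the numerator when ignoreDS is true.
func Calculate(podCPU, dsCPU, allocatable float64, ignoreDS bool) float64 {
	used := podCPU
	if !ignoreDS {
		used += dsCPU
	}
	return used / allocatable
}

func main() {
	ng := NodeGroup{Name: "ng-1", IgnoreDaemonSetsUtilization: true}
	// The per-nodegroup value is fetched *before* calling Calculate,
	// which is why the code block was moved below the nodegroup lookup.
	util := Calculate(6, 1, 10, ng.IgnoreDaemonSetsUtilization)
	fmt.Printf("%s utilization: %.2f\n", ng.Name, util)
}
```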
```go
allTestCases := testCases

// run all test cases again with `IgnoreDaemonSetsUtilization` set to true
```
This is so that we test against both `IgnoreDaemonSetsUtilization: true` and `IgnoreDaemonSetsUtilization: false`.
This will cause different test cases to have the same desc, making them harder to debug. For the sake of readability, I'd just have one long list of test cases to check, but if you have to generate them programmatically like this, please update desc as well.
Thank you for the suggestion.
I have added a suffix now, like `<test-case-desc> IgnoreDaemonSetsUtilization=true` and `<test-case-desc> IgnoreDaemonSetsUtilization=false`.
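The suffixing approach described above could look roughly like this. The `testCase` struct and `withSuffix` helper are illustrative names, not the actual test code:

```go
package main

import "fmt"

// testCase is a minimal stand-in for the real test-case struct.
type testCase struct {
	desc                        string
	ignoreDaemonSetsUtilization bool
}

// withSuffix duplicates each case for both settings and disambiguates
// desc so failures are easy to attribute to one variant.
func withSuffix(base []testCase) []testCase {
	var out []testCase
	for _, v := range []bool{false, true} {
		for _, tc := range base {
			tc.ignoreDaemonSetsUtilization = v
			tc.desc = fmt.Sprintf("%s IgnoreDaemonSetsUtilization=%v", tc.desc, v)
			out = append(out, tc)
		}
	}
	return out
}

func main() {
	for _, tc := range withSuffix([]testCase{{desc: "node underutilized"}}) {
		fmt.Println(tc.desc)
	}
}
```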
```diff
-type staticThresholdGetter struct {
-	threshold float64
+type staticNodeGroupConfigProcessor struct {
```
I have replaced `staticThresholdGetter` with `staticNodeGroupConfigProcessor` because I think it is a better implementation: we don't hard-code the threshold value to 0.5 but use the default values set here.
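The difference is roughly the following: the getter reads from a shared defaults struct instead of baking 0.5 into the test helper. The types below are simplified illustrations of that idea, not the real processor code:

```go
package main

import "fmt"

// NodeGroupAutoscalingOptions is an illustrative stand-in for the
// shared config defaults.
type NodeGroupAutoscalingOptions struct {
	ScaleDownUtilizationThreshold float64
	IgnoreDaemonSetsUtilization   bool
}

var defaultOptions = NodeGroupAutoscalingOptions{
	ScaleDownUtilizationThreshold: 0.5,
	IgnoreDaemonSetsUtilization:   false,
}

// staticNodeGroupConfigProcessor returns values from the shared
// defaults rather than a constant hard-coded in the test helper.
type staticNodeGroupConfigProcessor struct {
	opts NodeGroupAutoscalingOptions
}

func (p staticNodeGroupConfigProcessor) GetScaleDownUtilizationThreshold() float64 {
	return p.opts.ScaleDownUtilizationThreshold
}

func main() {
	p := staticNodeGroupConfigProcessor{opts: defaultOptions}
	fmt.Println(p.GetScaleDownUtilizationThreshold())
}
```

If the defaults ever change, tests built on this pattern pick up the new values automatically.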
```go
// from NodeGroupConfigProcessor interface
type actuatorNodeGroupConfigGetter interface {
	// GetIgnoreDaemonSetsUtilization returns IgnoreDaemonSetsUtilization value that should be used for a given NodeGroup.
	GetIgnoreDaemonSetsUtilization(context *context.AutoscalingContext, nodeGroup cloudprovider.NodeGroup) (bool, error)
}
```
This interface limits the functions that can be used from the `processors.NodeGroupConfigProcessor` interface. Since I only need `GetIgnoreDaemonSetsUtilization` here, I don't see the need to expose all the functions from `processors.NodeGroupConfigProcessor`.
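This relies on Go's implicit interface satisfaction: any type implementing the full processor interface automatically satisfies a narrower one. A minimal sketch, with made-up method signatures standing in for the real ones:

```go
package main

import "fmt"

// NodeGroupConfigProcessor is an illustrative stand-in for the full
// processors.NodeGroupConfigProcessor interface.
type NodeGroupConfigProcessor interface {
	GetIgnoreDaemonSetsUtilization(ng string) (bool, error)
	GetScaleDownUtilizationThreshold(ng string) (float64, error) // not needed by the actuator
}

// actuatorNodeGroupConfigGetter exposes only the one method the
// actuator actually calls, as described above.
type actuatorNodeGroupConfigGetter interface {
	GetIgnoreDaemonSetsUtilization(ng string) (bool, error)
}

type defaultProcessor struct{}

func (defaultProcessor) GetIgnoreDaemonSetsUtilization(ng string) (bool, error) {
	return true, nil
}

func (defaultProcessor) GetScaleDownUtilizationThreshold(ng string) (float64, error) {
	return 0.5, nil
}

func main() {
	// The full processor satisfies the narrow interface implicitly;
	// callers holding the narrow type cannot reach the other methods.
	var g actuatorNodeGroupConfigGetter = defaultProcessor{}
	v, _ := g.GetIgnoreDaemonSetsUtilization("ng-1")
	fmt.Println(v)
}
```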
I think this PR is becoming too big. I will be supporting this tag only for AWS in this PR.
```go
},
{
	nodeGroup: testNg2,
	testCases: getStartDeletionTestCases(testNg2),
```
I am running the test cases twice, once for each nodegroup defined above: the first nodegroup has `IgnoreDaemonSetsUtilization: false` and the second has `IgnoreDaemonSetsUtilization: true`. This tests the `IgnoreDaemonSetsUtilization` option on the nodegroup, which is read by `StartDeletion` (the function we are trying to test here) via `deleteAsyncDrain` and `deleteAsyncEmpty`.
Signed-off-by: vadasambar <[email protected]>

fix: test cases failing for actuator and scaledown/eligibility
- abstract default values into `config`

refactor: rename global `IgnoreDaemonSetsUtilization` -> `GlobalIgnoreDaemonSetsUtilization` in code
- there is no change in the flag name
- rename `thresholdGetter` -> `configGetter` and tweak it to accommodate `GetIgnoreDaemonSetsUtilization`

refactor: reset help text for `ignore-daemonsets-utilization` flag
- because per-nodegroup override is supported only for AWS ASG tags as of now

docs: add info about overriding `--ignore-daemonsets-utilization` per ASG
- in AWS cloud provider README

refactor: use a limiting interface in actuator in place of `NodeGroupConfigProcessor` interface
- to limit the functions that can be used
- since we need it only for `GetIgnoreDaemonSetsUtilization`

fix: tests failing for actuator
- rename `staticNodeGroupConfigProcessor` -> `MockNodeGroupConfigGetter`
- move `MockNodeGroupConfigGetter` to test/common so that it can be used in different tests

fix: go lint errors for `MockNodeGroupConfigGetter`

test: add tests for `IgnoreDaemonSetsUtilization` in cloud provider dir

test: update node group config processor tests for `IgnoreDaemonSetsUtilization`

test: update eligibility test cases for `IgnoreDaemonSetsUtilization`

test: run actuation tests for 2 NGs
- one with `IgnoreDaemonSetsUtilization`: `false`
- one with `IgnoreDaemonSetsUtilization`: `true`

test: add tests for `IgnoreDaemonSetsUtilization` in actuator
- add helper to generate multiple ds pods dynamically
- get rid of mock config processor because it is not required

test: fix failing tests for actuator

refactor: remove `GlobalIgnoreDaemonSetUtilization` autoscaling option
- not required

fix: warn message `DefaultScaleDownUnreadyTimeKey` -> `DefaultIgnoreDaemonSetsUtilizationKey`

refactor: use `generateDsPods` instead of `generateDsPod`

refactor: `globaIgnoreDaemonSetsUtilization` -> `ignoreDaemonSetsUtilization`

- instead of passing all the processors (we only need `NodeGroupConfigProcessor`)

- add suffix to tests with `IgnoreDaemonSetsUtilization` set to `true` and `IgnoreDaemonSetsUtilization` set to `false`
```go
for _, testSet := range testSets {
	for tn, tc := range testSet {
		t.Run(tn, func(t *testing.T) {
```
The code from this line onward is just a copy of the older code (shifted by some lines). I haven't changed anything in the code until https://github.com/kubernetes/autoscaler/pull/5672/files#r1174841347
@x13n I've addressed all the review comments. Can you please review this PR again 🙏
x13n left a comment:
Thanks for the changes, looks good to me!
/lgtm
```go
// BuildDSTestPod creates a DaemonSet pod with cpu and memory.
func BuildDSTestPod(name string, cpu int64, mem int64) *apiv1.Pod {
```
Separate PR makes a lot of sense, thanks!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: MaciekPytel, vadasambar, x13n. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing `/approve` in a comment.
Oh, and the hold was meant for me, so removing: /hold cancel
I read #5399 first, and then the discussion here, and it's still not clear to me what problem you encounter by ignoring DaemonSet utilization globally. The description notes:
If the nodes are large enough and the contribution from DaemonSets is proportionally small, what harm comes from ignoring their contribution on these large nodes too? What is the benefit of noting their contribution there?
If the consumption by daemonsets is really small, I think ignoring daemonset utilization makes sense — e.g., only 1 CPU consumed by daemonsets on a 32-CPU node might not matter much. But look at relatively larger consumption, say 1 CPU on a 10-CPU node. For example, imagine a scenario like this:

In case 2, we want the node to get scaled down. But in case 1, we might not want the node to get scaled down. Ignoring daemonset utilization for nodegroup1 means CA would scale down nodes in nodegroup1 even though a node is actually using 70% of its CPU (if you include daemonset utilization). This means the pods would need to find a new home. This can create potential disruption (application downtime until a new node is brought up by CA).
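The arithmetic behind the two node sizes discussed above can be worked through directly. The workload CPU figures (20 and 6) are assumed for illustration; only the daemonset and node sizes come from the discussion:

```go
package main

import "fmt"

// util returns node utilization with and without daemonset CPU counted.
func util(workloadCPU, dsCPU, nodeCPU float64) (withDS, withoutDS float64) {
	return (workloadCPU + dsCPU) / nodeCPU, workloadCPU / nodeCPU
}

func main() {
	// Large node: 1 of 32 CPUs in daemonsets barely moves utilization.
	w, wo := util(20, 1, 32)
	fmt.Printf("32-CPU node: %.3f with DS vs %.3f without\n", w, wo)

	// Small node: 1 of 10 CPUs shifts utilization by 10 percentage
	// points, which can flip a scale-down decision at a 0.7 threshold.
	w, wo = util(6, 1, 10)
	fmt.Printf("10-CPU node: %.2f with DS vs %.2f without\n", w, wo)
}
```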
Thank you for the explanation. Regarding case one: the target utilization of 0.7 (70%) is higher than I've ever used, and I still don't see why you want to count the DaemonSet's contribution there. If you want the node to survive your 70% threshold, it either needs 10% more "real" workload, or your actual threshold should be closer to 60% to get what you want. Counting the DaemonSet as part of that utilization falsifies the baseline atop which you sum the rest of the non-daemon workload's utilization. You could adjust the threshold instead.
What type of PR is this?
/kind feature
What this PR does / why we need it:
We want to support the ability to specify `IgnoreDaemonSetsUtilization` per nodegroup through ASG tags. Why? It is to cater for the following case (among other possible cases):

- `IgnoreDaemonSetsUtilization` is set to `false` globally because we want to consider daemonsets when calculating node resource utilization. This works well for ASGs/nodegroups which have large nodes, where daemonset utilization doesn't contribute much to node resource utilization. Due to this, we have no problem during node scale down.
- For ASGs/nodegroups with smaller nodes, however, daemonset utilization contributes enough to keep node utilization above `ScaleDownUtilizationThreshold`, blocking scale down.

To solve this problem, we want to support setting `IgnoreDaemonSetsUtilization` per ASG/nodegroup through a tag like `k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization: true`, which would override the global value of `IgnoreDaemonSetsUtilization` (which is `false`).

More details: #5399
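The override semantics described above — a per-ASG tag takes precedence over the global flag when present — can be sketched like this. `resolveIgnoreDS` is a hypothetical helper, not the actual AWS provider code:

```go
package main

import (
	"fmt"
	"strconv"
)

const ignoreDSTagKey = "k8s.io/cluster-autoscaler/node-template/autoscaling-options/ignoredaemonsetsutilization"

// resolveIgnoreDS returns the per-ASG tag value when present and
// parseable, otherwise the global default. Simplified sketch only.
func resolveIgnoreDS(tags map[string]string, globalDefault bool) bool {
	if raw, ok := tags[ignoreDSTagKey]; ok {
		if v, err := strconv.ParseBool(raw); err == nil {
			return v
		}
	}
	return globalDefault
}

func main() {
	tagged := map[string]string{ignoreDSTagKey: "true"}
	fmt.Println(resolveIgnoreDS(tagged, false))              // tag overrides the global value
	fmt.Println(resolveIgnoreDS(map[string]string{}, false)) // falls back to the global value
}
```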
Which issue(s) this PR fixes:
Fixes #5399
Special notes for your reviewer:
Nothing in particular.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: