[KEP-4815] Update KEP for 1.35 (#5515)
Conversation
mortent commented on Sep 5, 2025:
- One-line PR description: Update the design to simplify validation while preserving flexibility.
- Issue link: DRA: Partitionable Devices #4815
- Other comments:
/assign @johnbelamaric @pohly @klueska I have updated the KEP to reflect the idea that
pohly left a comment:
I like the approach, but we need to sort out some details.
pohly left a comment:
Some nits, but overall LGTM.
@liggitt can you render an opinion here on the API discussions?

/approve
johnbelamaric left a comment:
From a PRR perspective, can you describe, in the feature enablement/disablement section and the rollout, upgrade, and rollback planning section, what happens to resources using the old alpha fields? These are now alpha fields in v1 types. We do NOT need to support old alpha fields, nor do we need to support upgrade of alpha resources. However, we should document clearly what happens if people have existing objects with those fields set.
@johnbelamaric I have updated the PR to include more details about this.
> An invalid resource pool will not prevent a pod from getting scheduled if devices
> can be allocated from valid resource pools. For requests with `AllocationMode: All`,
> this means that those cannot be satisfied if any of the devices available on the node
> are in an invalid resource pool. Only if a pod can't be scheduled on any of the nodes in
> the cluster _and_ an invalid resource pool was encountered on at least one of the nodes,
> will the system return an error. This error will be surfaced as a scheduling error that
> will be added to the `Pod` status.
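For context, a request with `AllocationMode: All` as discussed in the quoted text might look like the following sketch. The field layout reflects my understanding of the `resource.k8s.io/v1` types; the device class and names are illustrative, not taken from the KEP:

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceClaim
metadata:
  name: all-node-gpus            # illustrative name
spec:
  devices:
    requests:
    - name: gpus
      exactly:
        deviceClassName: gpu.example.com   # hypothetical device class
        allocationMode: All                # claim every matching device on the node
```

With `allocationMode: All`, the claim is only satisfiable on a node where every matching device can be allocated, which is why a single invalid resource pool on a node makes such a request unsatisfiable there.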
How could anyone notice that they have a misconfiguration here? Will there be any metric or other way of checking that?
So they would notice when they try to schedule a pod, but the scheduler is unable to satisfy all claims and it encounters one or more invalid resource pools. So if there are sufficient devices available, it is possible that this might go unnoticed for some time, although arguably it doesn't really have any impact.
I could add a metric from the allocator that exposes the number of invalid resource pools encountered, but it has a few challenges:
- The allocator is invoked once per pod-node combination, so we will have to figure out a way to report the metric once per pod I think.
- The allocator tries to do as little work as possible per invocation, so it discards resource pools that aren't relevant to allocating a pod on a node. As a result, the number of invalid resource pools identified might differ based on the pod and node.
I think the best option is the "Future options" proposal in the KEP: https://github.com/kubernetes/enhancements/pull/5515/files#diff-8985915ed27658e0b602c60b13c47d5d7e0309f9dff42df6cc9d514da47d652fR660. Such a controller could update status on ResourceSlice objects and/or expose a metric with the number of invalid resource pools in the cluster.
However, misconfiguration may be hidden, and scheduling as a whole will work differently than expected. It would be good to be able to inform the admin (?) about this. This is not a blocker, as the KEP remains in the alpha stage, but it's worth reconsidering this aspect before moving to beta.
I agree that an error in a resource pool might lead to an unexpected set of devices being allocated to a pod, since the expected devices might be in an invalid resource pool. But that should be OK, since DRA will only allocate devices that meet the requirements in the ResourceClaim(s), so whatever is allocated should be usable.
I've updated the "Future options" section in the doc to also include metrics as a possible way to surface information about invalid resource pools.
updates to the API docs / validation lgtm
johnbelamaric left a comment:
small request to clarify some language
@macsko Any other concerns about this KEP from SIG-Scheduling?

Latest changes LGTM

/approve for SIG scheduling
/lgtm

The hold from @johnbelamaric seems to be resolved.

/approve
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dom4ha, johnbelamaric, mortent, pohly

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.