Track ownership for scale subresource of deployments #95921
Conversation
Hi @nodo. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Once the patch is verified, the new status will be reflected by the ok-to-test label.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Force-pushed from ca5b27c to 8582d50.
/assign @jennybuckley
Hm I wonder if, instead of the helper, it's better to get the managed fields at an index deployment.GetManagedFields()[i], assert the manager name, and assert that it contains the .spec.replicas field.
But for the helper:
- this probably could be named assertReplicasOwnership instead
- at the beginning, call t.Helper()
- at the end, if !managerSeen { t.Fatalf(...) } if .spec.replicas is not in managed fields at all for some reason
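For illustration, a minimal sketch of what such a helper could look like, assuming the assertReplicasOwnership name suggested above; the decoding of the FieldsV1 payload and the exact assertions are assumptions, not code from the PR:

import (
	"encoding/json"
	"testing"

	appsv1 "k8s.io/api/apps/v1"
)

// assertReplicasOwnership checks that expectedManager (and no other manager)
// owns .spec.replicas in the deployment's managed fields.
func assertReplicasOwnership(t *testing.T, deployment *appsv1.Deployment, expectedManager string) {
	t.Helper()
	managerSeen := false
	for _, entry := range deployment.GetManagedFields() {
		if entry.FieldsV1 == nil {
			continue
		}
		var fields map[string]interface{}
		if err := json.Unmarshal(entry.FieldsV1.Raw, &fields); err != nil {
			t.Fatalf("failed to decode managed fields of %q: %v", entry.Manager, err)
		}
		spec, ok := fields["f:spec"].(map[string]interface{})
		if !ok {
			continue
		}
		if _, owns := spec["f:replicas"]; !owns {
			continue
		}
		if entry.Manager != expectedManager {
			t.Fatalf("unexpected manager %q owns .spec.replicas", entry.Manager)
		}
		managerSeen = true
	}
	if !managerSeen {
		t.Fatalf(".spec.replicas is not in managed fields at all")
	}
}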
IIRC, the reason we have this helper is to ensure that a new manager owns the replicas field but also that no other manager does, right? I am not sure we could check both conditions using deployment.GetManagedFields()[i].
As for the suggestions, totally! Thanks for spotting, I will make the changes.
Nice -- could you add/update a unit test for this in addition to the integration test? I think in storage_test.go.
/ok-to-test
/lgtm
I typically try to avoid having variables that only make sense in a specific context and not in others. In this case, the ParentFieldManager member is always in scope but is only used for specific sub-resources. For the scale sub-resource, do we actually use both fieldmanagers? Is it possible that the fieldmanager above is misconfigured?
I haven't looked extensively, but it's possible that we actually run the FieldManager on the scale object itself before :-(.
Yes, if I am not mistaken, it's used on the scale object itself too (e.g. https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/endpoints/handlers/patch.go#L553)
Yeah, for nothing, but yes :D
Yeah, for nothing, but yes :D
Does that mean that even if this PR enabled tracking ownership via /scale, server-side-applying to /scale would not notify of field manager conflicts on the replicas field? Shouldn't it?
Rather than trying to run two fieldmanager passes on Scale, could we:
- translate the managedFields entry in the Deployment containing spec.replicas to the Scale object in scaleFromDeployment?
- translate the managedFields entry for spec.replicas in the resulting Scale object (indicating the owner of the scale's spec.replicas field) back to the Deployment in scaleUpdatedObjectInfo#UpdatedObject?
something like this:
diff --git a/pkg/registry/apps/deployment/storage/storage.go b/pkg/registry/apps/deployment/storage/storage.go
index 8bbe4690cea..ebe2b46e093 100644
--- a/pkg/registry/apps/deployment/storage/storage.go
+++ b/pkg/registry/apps/deployment/storage/storage.go
@@ -329,6 +329,26 @@ func scaleFromDeployment(deployment *apps.Deployment) (*autoscaling.Scale, error
if err != nil {
return nil, err
}
+
+ var managedFields []metav1.ManagedFieldsEntry
+ // TODO: handle different paths for different API versions
+ if managedEntry, exists := findEntry(deployment.ManagedFields, "spec", "replicas"); exists {
+ // We found an owner for the spec.replicas field.
+ // Translate that into a single v1.Scale managed fields entry
+ managedFields = []metav1.ManagedFieldsEntry{{
+ // copy fields from relevant owner
+ Manager: managedEntry.Manager,
+ Operation: managedEntry.Operation,
+ Time: managedEntry.Time,
+ // Produce a FieldsV1 managed field entry
+ FieldsType: "FieldsV1",
+ // API version of v1.Scale
+ APIVersion: "v1",
+ // field path to v1.Scale replicas
+ FieldsV1: &metav1.FieldsV1{Raw: []byte(`{"f:spec":{"f:replicas":{}}}`)},
+ }}
+ }
+
return &autoscaling.Scale{
// TODO: Create a variant of ObjectMeta type that only contains the fields below.
ObjectMeta: metav1.ObjectMeta{
@@ -337,6 +357,7 @@ func scaleFromDeployment(deployment *apps.Deployment) (*autoscaling.Scale, error
UID: deployment.UID,
ResourceVersion: deployment.ResourceVersion,
CreationTimestamp: deployment.CreationTimestamp,
+ ManagedFields: managedFields,
},
Spec: autoscaling.ScaleSpec{
Replicas: deployment.Spec.Replicas,
@@ -404,5 +425,9 @@ func (i *scaleUpdatedObjectInfo) UpdatedObject(ctx context.Context, oldObj runti
// move replicas/resourceVersion fields to object and return
deployment.Spec.Replicas = scale.Spec.Replicas
deployment.ResourceVersion = scale.ResourceVersion
+
+ if managedEntry, exists := findEntry(scale.ManagedFields, "spec", "replicas"); exists {
+ // TODO: merge the Scale managed entry for spec.replicas back into the deployment
+ }
return deployment, nil
}
I'm wondering if that two-way translation of the relevant managed entry would make the existing field manager work correctly as is.
Right, I have added the test here: 2c14088, and it works as expected.
If I understand correctly, the field manager of the main resource is also invoked when running apply on scale, so it triggers a conflict. Am I missing something?
Ah! There's something tricky happening here. To verify that, I think you could write a test where apply will actually ALWAYS conflict on scale. When you apply an object that doesn't have a managedFields set (and that's the case of EVERY scale object here), we create one that we call "before-first-apply" with existing fields. Here, you're getting a conflict with that manager, not the actual manager that changed the value. Every time you apply, we'll re-create this "before-first-apply" manager (even though you're supposed to own the field), and you'll get a conflict.
I don't know yet how to solve that.
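To make that concrete, the synthesized entry looks roughly like the sketch below; this is an illustration of the behaviour described above rather than code from the PR, and the operation and field set shown are assumptions:

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Sketch: when an applied object carries no managedFields (as every Scale sent
// to /scale does), the server records the pre-existing fields under a synthetic
// "before-first-apply" manager; later applies to /scale then conflict with this
// manager instead of the one that actually last changed the value.
var beforeFirstApply = metav1.ManagedFieldsEntry{
	Manager:    "before-first-apply",
	Operation:  metav1.ManagedFieldsOperationUpdate, // assumed
	APIVersion: "autoscaling/v1",                    // assumed version for the scale request
	FieldsType: "FieldsV1",
	FieldsV1:   &metav1.FieldsV1{Raw: []byte(`{"f:spec":{"f:replicas":{}}}`)},
}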
Thanks a lot for spotting, that's quite a tricky edge case :)
I think a possible solution would be to not run Apply on scale objects. E.g. before this line, add something like this:
if scale, ok := patchObject.(*autoscaling.Scale); ok {
	return scale, nil
}
Applying the field manager here doesn't have any effect anyway.
I tried to implement it but I got stuck because I couldn't find a way to convert an *unstructured.Unstructured to *autoscaling.Scale and I wanted to check with you first.
What do you think about this solution? If it's a good way forward, how to do the conversion?
I have done some more tests and I was trying to convert the Unstructured object using the following code:
scaleGVK := schema.GroupVersionKind{Group: "autoscaling", Version: "v1", Kind: "Scale"}
if p.kind == scaleGVK {
	var scale autoscaling.Scale
	err := runtime.DefaultUnstructuredConverter.FromUnstructured(patchObj.UnstructuredContent(), &scale)
	return &scale, err
}
This doesn't work for a reason I don't fully understand. The resulting scale object doesn't have any metadata, and I am getting this error:
error the name of the object (deployment based on URL) was undeterminable: name must be provided
Digging a bit deeper, I have noticed that the Scale object does not specify any JSON annotations, do you know the reason for that? Other types such as Deployment have them. Could this be a problem?
Digging a bit deeper, I have noticed that the Scale object does not specify any JSON annotations, do you know the reason for that? Other types such as Deployment have them. Could this be a problem?
You're looking at the "internal type" rather than the "external type". Internal types are not converted to json and hence don't have the tags. You should be looking at the external type (note the difference in the URL, it's probably too subtle ...): https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/api/autoscaling/v1/types.go#L114-L127
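For illustration, a sketch of the conversion against the external autoscaling/v1 type; scaleFromUnstructured is a hypothetical helper, and the surrounding wiring is assumed rather than taken from the PR:

import (
	autoscalingv1 "k8s.io/api/autoscaling/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/runtime/schema"
)

// scaleFromUnstructured converts an unstructured patch object into the external
// autoscaling/v1 Scale type, whose JSON tags let the metadata (name, namespace,
// ...) survive the conversion, unlike the internal type.
func scaleFromUnstructured(kind schema.GroupVersionKind, obj *unstructured.Unstructured) (*autoscalingv1.Scale, bool, error) {
	scaleGVK := schema.GroupVersionKind{Group: "autoscaling", Version: "v1", Kind: "Scale"}
	if kind != scaleGVK {
		return nil, false, nil
	}
	var scale autoscalingv1.Scale
	if err := runtime.DefaultUnstructuredConverter.FromUnstructured(obj.UnstructuredContent(), &scale); err != nil {
		return nil, false, err
	}
	return &scale, true, nil
}

Whatever shape the real fix takes, the handler presumably still needs the internal type further down, so a conversion back through the registered scheme would likely be required; the sketch only shows why the external type keeps its metadata.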
Are we guaranteed that options is always non-nil here?
It should be quite safe; options is created here, just before the Update call. Do you prefer me to add extra checks?
I'm assuming that we need to copy in case the Update changes the object in place. I'm pretty sure it doesn't (do you remember @jpbetz?). Either way, there is no reason to copy if we don't call Update; at least move the copy into the if statement. But I don't think the copy is needed.
I might be wrong, have you experimented with it? What made you think you needed to copy?
My assumption was that we need to pass two objects to Update: 1. the object "before" changing it and 2. the object "after" changing it, so that the managed fields are updated accordingly. If we don't copy it, we effectively pass the same object. I am probably missing something though.
If we need to keep the copy, and we decide we want to do it only when necessary, the resulting code is a bit convoluted. I would propose something like:
var live *apps.Deployment
if i.parentFieldManager != nil {
	live = deployment.DeepCopy()
}
// move replicas/resourceVersion fields to object
deployment.Spec.Replicas = scale.Spec.Replicas
deployment.ResourceVersion = scale.ResourceVersion
if i.parentFieldManager != nil {
	// update managed fields using the fieldmanager of the main resource
	return i.parentFieldManager.Update(live, deployment, i.manager)
}
return deployment, nil
WDYT?
LGTM once we understand if we need to copy the object.
/lgtm
I'm really not a big fan of this here. Other suggestion:
- Check that there is an entry for the manager used in the scale operation,
- Check for conflicts on changes to that field
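A rough sketch of what those two checks could look like in an integration test; the manager names, replica value, and helper shape are illustrative assumptions, not code from the PR:

import (
	"context"
	"testing"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/kubernetes"
)

// checkScaleOwnership verifies that the manager used in the scale operation has
// a managed fields entry, and that another manager applying a different
// spec.replicas value hits a conflict.
func checkScaleOwnership(ctx context.Context, t *testing.T, client kubernetes.Interface, ns, name string) {
	t.Helper()
	deployment, err := client.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		t.Fatal(err)
	}
	found := false
	for _, e := range deployment.GetManagedFields() {
		if e.Manager == "scale-manager" { // illustrative name for the scale operation's manager
			found = true
		}
	}
	if !found {
		t.Fatalf("no managed fields entry for the scale operation's manager")
	}

	// Apply a replicas value assumed to differ from the live one, under a
	// different manager: this should be rejected with a conflict.
	patch := []byte(`{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"name":"` + name + `"},"spec":{"replicas":5}}`)
	_, err = client.AppsV1().Deployments(ns).Patch(ctx, name, types.ApplyPatchType, patch,
		metav1.PatchOptions{FieldManager: "another-manager"})
	if !apierrors.IsConflict(err) {
		t.Fatalf("expected conflict on spec.replicas, got: %v", err)
	}
}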
I am not sure I fully understood.
Check that there is an entry for the manager used in the scale operation,
This is done here unless I missed something. fieldManager is the manager used in the scale operation, see for instance here.
Check for conflicts on changes to that field
This was my attempt to check for conflicts; since the check happens right after applying, I thought it made sense to check it outside a helper.
I couldn't find an easy way to check that the replicas field is owned by one field manager (and not by any other) without decoding. I am open to suggestions though.
I see your point about not checking internal structures, but we don't have unit tests for this feature and I would feel more confident with a stricter integration test. As we discussed, we could definitely refactor this bit into a clearer framework.
WDYT?
/lgtm
does this do the right thing when the field path of the replicas field is different between two versions of the object?
For example, given:
- a CRD with two versions defined:
  - v1, with the field path for replicas located at spec.replicasV1
  - v2, with the field path for replicas located at spec.replicasV2
- a conversion webhook that converts spec.replicasV1 <-> spec.replicasV2
- a scale request to v1 scale and v2 scale
Do the managed fields get set properly for each versioned request?
That would be my assumption, but I am not 100% sure. @apelisse do you know?
This is in apps/deployment, so I'm assuming the question is unrelated to the change?
With regard to the question, yes, we would do the right thing. "managedFields" are tracked per version, and we conflict when the value changes, which we can detect by converting the object to the version of the managedFields.
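As a rough illustration of what "tracked per version" means, using the hypothetical replicasV1/replicasV2 CRD from the question above (the group, manager names, and operations below are made up):

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// Sketch: each entry records the API version whose field paths it uses, so the
// same logical field shows up as f:replicasV1 under v1 and f:replicasV2 under
// v2, and conflict detection converts objects to the entry's version to compare.
var exampleEntries = []metav1.ManagedFieldsEntry{
	{
		Manager:    "scaler",
		Operation:  metav1.ManagedFieldsOperationApply,
		APIVersion: "example.com/v1",
		FieldsType: "FieldsV1",
		FieldsV1:   &metav1.FieldsV1{Raw: []byte(`{"f:spec":{"f:replicasV1":{}}}`)},
	},
	{
		Manager:    "autoscaler",
		Operation:  metav1.ManagedFieldsOperationUpdate,
		APIVersion: "example.com/v2",
		FieldsType: "FieldsV1",
		FieldsV1:   &metav1.FieldsV1{Raw: []byte(`{"f:spec":{"f:replicasV2":{}}}`)},
	},
}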
yeah, sorry... the question was about what would happen if we extrapolated this approach to the scale subresource for a CRD
plumbing via request-scoped context is not what I expected here. is there a reason the storage that needs the field manager cannot get it more directly?
I'm not personally the most knowledgeable about wiring in the apiserver in general, but I had the same reaction when I saw that change. I've tried to find a better way and couldn't find one. That, of course, doesn't mean there isn't one.
Thanks a lot @liggitt for review! 🙏
Field manager will be notified on conflicts (see the integration test here). It just runs the logic of the deployment field manager.
The problem with this solution is that we would need to merge back the Scale managed fields entry into the deployment. That's why, in the end, this approach seemed more straightforward.
New changes are detected. LGTM label has been removed.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: apelisse, nodo
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Does that conflict error come from here: Do
@liggitt you raised very good points in your latest review. The conflicts were not raised properly when applying to scale. I have updated the PR so that now:
1. It produces a conflict when .spec.replicas is different from the main resource
2. It updates the deployment when forcing it
Force-pushed from 216e2db to b7f3ad6.
@nodo: The following tests failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Closing in favour of #98377
What type of PR is this?
/kind bug
What this PR does / why we need it:
Track ownership for scale subresource of deployments. This is achieved using the main resource field manager.
Which issue(s) this PR fixes:
Part of #82046
Special notes for your reviewer:
Passing the main resource field manager using context is not great; however, it could be an acceptable compromise given that we might refactor this logic in #84530.
This PR uses some pieces of code and suggestions from #83294 and #83444.
Does this PR introduce a user-facing change?: