-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
controller history #594
controller history #594
Conversation
@kubernetes/sig-api-machinery-api-reviews |
@kow3ns: These labels do not exist in this repository: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
## Abstract | ||
In Kubernetes, in order to update and rollback the configuration and binary | ||
images of controller managed Pods, users mutate DaemonSet, StatefulSet, | ||
and ReplicaSet (via Deployment) Objects, and the corresponding controllers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't mutate ReplicaSets[*], we create new copies when the Deployment is mutated.
[*] apart from syncing annotations, labels, and spec.MinReadySeconds between the two objects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is not clear. You mutate Deployments which causes replicaset's to be created, deleted, and updated. It is a more complicated workflow. I will update to reflect this.
of versions of an Object to reconstruct the Object's history. | ||
1. History respects causality. The Object type used to store point in time | ||
snapshots must be strictly ordered with respect to creation. CreationTimestamp | ||
should not be used, as this is susceptible to clock skew. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is clock skew between multiple masters in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Multiple API servers is my concern yes.
- This may not necessarily be all fields of Object. | ||
- The target Object state is the subset of the target state necessary to create | ||
a Objects of the generated type(s). | ||
- To make the this concrete, for the StatefulSet controller: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
### Version Tracking | ||
Controllers must track the version of the target Object state that corresponds | ||
to their generated Objects. This information is necessary to determine which | ||
versions are live, and to track which Objects need to be updated during a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "and to track which Objects need to be updated during a target state update or rollback"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
History objects are immutable, pods are created or deleted, parent objects are updated by users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I have 10 Pods, and five of them are at a semantically equivalent version to my current target I only want to restart the 5 that are not. The other 5 may need to be relabeled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think my questions is answered, you mean revision not apiversion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, clearer now.
will require testing each Object for membership in the `OwnerReferences` | ||
of all AnyStates. | ||
|
||
At the cost of the complexity of implementing both labeling and ownership, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ownership is based on labeling. If an owner cannot match a child, the child is orphaned. If an owner matches an object that is not a child, the object is adopted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has special semantics and, afiak, you can't select based on it can you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current way our controllers work is:
- select all the generated objects in the namespace for a parent: set A
- from set A, select all objects that match the parent selector: set B
- from set B, filter out all objects with an owner ref other than the current parent: set C
- from set C, adopt all objects that have no owner ref
- from set A, release all objects whose owner ref matches the parent but their labels don't match the parent selector anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been putting some more thought into this. I did not consider adoption initially because I was coming from the viewpoint that controller history is not adoptable. I still have not convinced myself that it makes sense to adopt the history of another object, but I haven't convinced myself that this is an aweful idea either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaming is one use case, handing off to a different workload controller type is another (eg. moving away from Deployments into a TPR).
1. The controller may deserialize the values of the AnyStates representing their | ||
target Object state and perform a deep, semantic equality test. Here all | ||
differences that do not constitute a mutation to the target Object state are | ||
disregarded during the equivalence test. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here all differences that do not constitute a mutation to the target Object state are disregarded during the equivalence test.
How is this going to be done? Or are we purely based on defaulting filling out any new fields that may have been introduced since we serialized the template?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deserialize each object and write a long, Java style, object equivalence comparison. Every time a new field gets added, update the imperative code that performs the comparison. As indicated below, this might get us off the ground but should probably not be the long term solution. Strategic merge patch is the way to go in the long run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time a new field gets added, update the imperative code that performs the comparison.
Oh, it's unlike that this will be maintainable. It's going to skew at some point. Maybe a reflective test can help us but I am not sure why we want this over a simple deep equality check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deep equality will return false positives that can cause unintentional upgrades. My largest concern is that a new field with a default value causes the a rolling update for all StatefulSet's and DaemonSet's during/after a Master upgrade. As long as rolling update works well it shouldn't actually result in a loss of availability, but it could be the worst kind of surprising.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a new field with a default value causes the a rolling update for all StatefulSet's and DaemonSet's during/after a Master upgrade.
Why is that the case? Wouldn't the new default affect both the DaemonSet/StatefulSet and AnyState templates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on how the state is stored in the CR. After talking with Daniel its probably not as likely to occur as I originally thought. Default values on PodTemplateSpec doesn't really mean anything anyway. If we need defaults we're better of to add them to the PodSpec. Defaulting a specification object really doesn't make much sense. We should add defaults to the object that models an actual thing (i.e. Pod).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't have a particular preference on the matter but default values today won't cause extra rollouts (because both the StatefulSet and the CR will get them). If we ever change that, then I agree that defaulting should happen only for Pods.
We propose the following command. | ||
|
||
```bash | ||
> kubectl history <controller-kind> <controller-name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a kubectl rollout history
command.
extension of the [above](#viewing-history ). | ||
|
||
```bash | ||
> kubectl rollback <controller-kind> <controller-name> --revision <any-state-name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have a kubectl rollout undo
command.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and kubectl rollout undo --to-revision
We can use our [hash function](#hashsing) and | ||
[collision resolution](#collision-resolution) scheme to generate a system | ||
wide unique identifier for an Object based on a deterministic non-unique prefix | ||
and a serialized representation of the Object. Kuberentes Object's `.Names` must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean .Name
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it is the plural possessive of the field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe reword for clarity? Something like:
Values in the .Name
field of Kubernetes Objects must conform to ...
conform to a DNS subdomain. Therefore, the total length of the unique identifier | ||
must not exceed 255, and in practice 253, characters. We can generate a unique | ||
identifier that meets this constraint by selecting a hash function such that the | ||
output length is equal to `253-len(prefix)` and applying our [hash](#hashing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is handling names for history resources. In the case of Deployments, the name of the history object (ReplicaSet) is used in generating the name of the Pods. So I guess for Deployments, this needs to differ slightly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that Deployments should use the same concept where possible to deal with ReplicaSets, but, in this case, and in others, they will likely need special handling. Also, this is just to demonstrate that the approach is plausible, we will need to think about the naming implementation, and it will have to be based on the new hash function that will be used for deployment.
Objects along with its status. | ||
- For ReplicaSet, DaemonSet, and StatefulSet, the current state of the | ||
corresponding controllers can be derived from Pods they contain and the | ||
ReplicasSetStatus, DaemonSetStatus, and StatefulSetStatus objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For history, we do care about the status fields ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only is so much as the current current and update revisions are stored in the Status Kind for the object.
// Once an AnyState has been successfully created, it can not be updated. The | ||
// API Server will fail validation of all update requests that attempt to mutate | ||
// the Data field. AnyStates may, however, be deleted. | ||
type AnyState struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you intend to use it for purposes other than history ? AnyState is too generic a name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have other purpose right now, but the object is named after what it is. Its a generic, immutable snapshot of state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not in favor of this, as it's too broad. We already have ConfigMap for "any data". Creating configmap with another name seems wrong.
[equivalence](#version-equivalence) with all other versions in the Object's | ||
revision history. | ||
- If the current version is semantically equivalent to its immediate | ||
predecessor no update to the Object's target state has been performed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how do you determine the immediate predecessor ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorting based on the Generation.
|
||
1. The controller will [create a snapshot](#version-snapshot-creation) of the | ||
current target Object state. | ||
1. The controller will [reconstruct the history](#history-reconstruction) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this history reconstruction will not always happen , only when there exists no history right ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you will reconstruct the history during every execution of the main control loop. In practice controllers benefit from a shared cache that mediates interaction with API server.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns maybe its just wording, but reconstruction means fixing a corrupted history or creating one from scratch. For every loop of the controller, we would just add a new history if the spec changed, isnt it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reconstructing means to reassemble the discreet snapshots into and ordered set as described in the referenced subsection
- If the current version is not equivalent to any prior version, this | ||
indicates an update or a roll forward. | ||
- Controllers should use their status objects for book keeping with respect | ||
to current and prior revisions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean revision numbers should be kept in status fields ? Currently revisions are annotations in deployments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
names. | ||
- e.g `PodTemplateSpecs` should be stored at the `"Template"` key. | ||
1. The controller will store the specification type Objects `.Generation` at the | ||
`"ControllerGeneration"` key. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you proposing we use the generation number of the objects as revision number and their monotonically increasing sequence as a way to determine the point in time in history . Just clarifying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why we would go so generic here. If we have to define a schema on Data
, we might as well specialize the object and define the schema on that instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea of the ConfigMap like Object is avoid assumptions about the state data that third parties and operators might use. We will use Template and VolumeClaimsTemplate. An Operator might use an arbitrary field. Even tracking Generation as a well defined type won't work at this point for TPRs. We might want to consider going with a Revision field though. Controllers, that have a working Generation can use this. Deployments, Operators, TPC, can use the incremental Revision technique that Deployment currently uses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see the argument, but still uncomfortable with the idea of an api object that looks a lot like config map that is explicitly described as "generic key value store". There has to be a really strong reason to go that generic. Also, map vs array comes into play here - I'd rather an array with more structured data about each bit of state, especially if these correlate to actual fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With an array, how would we map back to the origin field?
The strong reason for this construction is that you can use the JSON encoded version of the Object to merge patch a delta to an arbitrary other Object. You can use it to save a delta that can be reapplied later, and it will work for an arbitrary Controller-like Object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Array is:
Templates []History
type History struct {
Purpose/Name string // i.e. "podTemplate"
TypeMeta // the api and version of the historical snippet for use by the client
Value runtime.RawExtension // the snippet
}
Purpose/Name would map back (same as the key in the map).
Another point - a pod template is serializable but a volume claim template is not (no type metadata). So there has to be additional versioning data attached if you want to store things that are not top level objects . You could alternatively require the controller to store a full object PersistentVolumeClaim
instead of VolumeClaimTemplate
and reconstitute the sub object from the full, or you could create a new VolumeClaimTemplate
and attach object metadata (which is not used?). Ideally we would only serialize exactly what we need, but we would need to specify a group and version for the data we specify.
If serialized state can span two api groups/versions, we need to have distinct group versions. If the state must be intrinsic, the controller revision may store a single type/version (and arbitrary snippets that it can understand for that version). I kind of prefer the latter, which means you can serialize subsets of the object.
In proto serialization, do you want to store JSON?
To reconstruct the history of a specification type Object, a controller will do | ||
the following. | ||
|
||
1. Select all AnyState Objects labeled as described |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are these namespaced , sorry i missed that part , if you already discussed that earlier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
1. Sort the AnyStates by the value associated with the | ||
`"ControllerGeneration"` key. | ||
1. This produces a strictly ordered set of AnyStates that comprises the ordered | ||
revision history of the specification type Object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be good to clarify, at what point/cases is the history construction necessary
equal to the configured `RevisionHistoryLimit`. | ||
|
||
### Version Tracking | ||
Controllers must track the version of the target Object state that corresponds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By Version of the target, you mean revision of the target or APIVersion ?
record all Objects that are generated from target Object state version | ||
represented by the AnyState as its owners. | ||
- A revision is considered to be live while any generated Object that owns | ||
it is live. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what we are saying here is that amongst all AnyState objects for lets say StatefulSets, only the live one have ownership set the StatefulSet ? In terms of ownership , i think all AnyState which represent history should have their ownerreference pointing to the object whose history they represent. This would help in gc'ng history when the object is deleted as well
controllers that implement one method are not bound to use the same method | ||
in the future. | ||
|
||
### Target Object State Reconciliation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify this more ? Is this a new step needed to maintain the history ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to re lable your product to match the specification that it is equivalent to, update the product in place, or recall and redistribute the product.
Object state. | ||
1. Controllers can reconstruct their revision history. | ||
1. Controllers can't update an AnyState's `.Data`. | ||
1. Controllers can delete an AnyState to maintain their history with respect to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a user delete the history using kubectl ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we set the revisionhistorylimit as 0 and hence the controllers dont generate AnyState at all ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we set the revisionhistorylimit as 0 and hence the controllers dont generate AnyState at all ?
revisionhistorylimit=0 doesn't mean that the controllers are not generating history, eg. a DaemonSet in the middle of a rollout would need both the old and the new history object. revisionHistoryLimit should take effect only for history objects with no links (no running pods in the system generated by them).
@kow3ns few other questions: what chnages are needed to consolidate with deployment revisions that are currently stored in ReplicaSets. Will they coexist ? Can you manually delete history ? Will pods created manually (without using rs) also have history ? |
I don't think we are going to make any changes in Deployments in the short term. I see no issue with coexistence since this will be initially implemented for DaemonSets and StatefulSets.
You should be able but you shouldn't do it. You probably want to use revisionHistoryLimit.
I am not sure we are going to add this for ReplicaSets since they don't support upgrades although I could imagine squashing ReplicaSets into Deployments in the future (after careful evaluation of the option). I don't think we are going to use history for bare pods. |
## Requirements | ||
|
||
1. History is a collection of points in time, and each point in time must be | ||
represented by its own Object. While it is tempting to aggregate all of an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Event if the points in time are the same we will have separate Objects for them? I'm talking about situation, when changing image: nginx -> httpd -> nginx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, in such a case the old object is reused.
serialized representation of the Object's data. The unique hash and integer can | ||
be combined to produce a unique suffix for the Object's `.Name`. | ||
|
||
1. We must also ensure that unique name does contain any bad words. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope you meant "doesn't"? :)
The API Server must support the creation and deletion of ControllerRevision | ||
objects. As we have no mechanism for declarative immutability, the API server | ||
must fail any update request that updates the `.Data` field of a | ||
ControllerRevision Object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want, you can mention that by making it immutable now, in the future we can relax immutability when that construct is introduced.
Object, a controller will do the following. | ||
|
||
1. The controller will serialize all the Object's target object state and store | ||
the serialized representation in the ControllerRevision's `.Data`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to call out specifically that we will trim / reduce / remove data that is not related to state (i.e. clear metadata fields, clear status, etc)? Would be good to explicitly call out what you want to persist here (I assume it's statefulset - metadata - status - other fields that don't clearly map) and maybe provide a concrete example in the section below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left this general on purpose so that client code, StatefulSet controller, DaemonSet controller, operators, etc, is not constrained. There are examples of what the target Object state is above, (i.e it is called out that for ReplicaSet, DaemonSet, and StatefulSet this include the PodTemplateSpec, and that this includes the VolumeClaimsTemplate for StatefulSet). I'd prefer to defer the specifics of capturing the state to the StatefulSet update proposal and DaemonSet update proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
Generally LGTM, some structural things and I'll defer to others on the details of hash / label / collisions. |
// Data contains the serialized state. | ||
Data runtime.RawExtension | ||
// Revision indicates the revision of the state represented by Data. | ||
Revision int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Data and Revision should be inside a Spec block, probably.
Revision is going to be a confusing field given that metadata already has both ResourceVersion and Generation. :/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hrm - I'm not sure they should be a spec block, unless they have status. This is closer to a secret, config map, or user.
Same comment as Clayton
…On Mon, May 15, 2017 at 2:04 PM, Clayton Coleman ***@***.***> wrote:
Generally LGTM, some structural things and I'll defer to others on the
details of hash / label / collisions.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#594 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHuudqPRKOIIzRG_wKCBj65Njd22wdW6ks5r6L3DgaJpZM4NOrH1>
.
|
1. The controller will serialize all the Object's target object state and store | ||
the serialized representation in the ControllerRevision's `.Data`. | ||
1. The controller will store a unique, monotonically increasing | ||
[revision number](#revision-number-selection) in the Revision field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this be the same as the controller's metadata.generation? If not, why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generation is recorded by the api server, revision is a generation materialized by a controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is answered below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revision is a generation materialized by a controller.
But not the actual generation number. Revision essentially tracks observed pod template updates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kargakis could you help understand. I earlier thought revision was populated from generation of the object. Any time an object changes spec, apiserver increments its generation, which sounds exactly like revision of a spec.
Does a revision have status? It's generally ok not to have spec *if* you
don't plan on status, or your object isn't materialized on the cluster.
Could we make the field Revision be "ObservedGeneration" or similar? Since
generation is on ObjectMeta i don't feel terrible about reusing it.
…On Mon, May 15, 2017 at 5:51 PM, Daniel Smith ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In contributors/design-proposals/controller_history.md
<#594 (comment)>:
> +
+``` golang
+// ControllerRevision implements an immutable snapshot of state data. Clients
+// are responsible for serializing and deserializing the objects that contain
+// their internal state.
+// Once a ControllerRevision has been successfully created, it can not be updated.
+// The API Server will fail validation of all requests that attempt to mutate
+// the Data field. ControllerRevisions may, however, be deleted.
+type ControllerRevision struct {
+ metav1.TypeMeta
+ // +optional
+ metav1.ObjectMeta
+ // Data contains the serialized state.
+ Data runtime.RawExtension
+ // Revision indicates the revision of the state represented by Data.
+ Revision int64
Data and Revision should be inside a Spec block, probably.
Revision is going to be a confusing field given that metadata already has
both ResourceVersion and Generation. :/
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#594 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABG_p64B7OFIzF7-erqY8hGOrkGmp6f6ks5r6MjSgaJpZM4NOrH1>
.
|
Revision is updated on observed pod template updates. ObservedGeneration as we use it everywhere today is for tracking spec changes so by doing this we are conflating two different things. |
@lavalamp I would think ControllerRevision's would have neither a .Spec nor a .Status. They represent a snapshot and not a specification that the controller is attempting to realize. |
the current state of the system to the new declared target state. | ||
|
||
To facilitate update and rollback for these controllers, and to provide a | ||
primitive that third party controllers can build on, we propose a mechanism |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns are we proposing that this revision tracking can be used in TPR based operators to track TPR Versions ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually
// Data contains the serialized state. | ||
Data runtime.RawExtension | ||
// Revision indicates the revision of the state represented by Data. | ||
Revision int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns where is the metadata to tell if its StatefulSets history or DaemonSet's ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can store the metadata inside the object. In general ControllerRefs and labeling should prevent overlap.
|
||
1. The controller will [create a snapshot](#version-snapshot-creation) of the | ||
current target Object state. | ||
1. The controller will [reconstruct the history](#history-reconstruction) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns maybe its just wording, but reconstruction means fixing a corrupted history or creating one from scratch. For every loop of the controller, we would just add a new history if the spec changed, isnt it ?
1. The controller will serialize all the Object's target object state and store | ||
the serialized representation in the ControllerRevision's `.Data`. | ||
1. The controller will store a unique, monotonically increasing | ||
[revision number](#revision-number-selection) in the Revision field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Kargakis could you help understand. I earlier thought revision was populated from generation of the object. Any time an object changes spec, apiserver increments its generation, which sounds exactly like revision of a spec.
|
||
1. Select all ControllerRevision Objects labeled as described | ||
[above](#version-snapshot-creation). | ||
1. Filter any ControllerRevisions that do not have a ControllerRef in their |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: you probably mean filter out any ControllerRevisions
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
filter, as a verb, is used correctly here
|
||
At the cost of the complexity of implementing both labeling and ownership, | ||
controllers may use a combination of both approaches to mitigate the | ||
deficiencies of each. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns do we need both. why not just start with one and see if we need the other as well ? I feel like ownerReferences gives you garbage collection as well. So we can ignore the labelling of the owning objects until we find a good reason to do it ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to the owner reference to ensure that overlap does not occur. We have been bitten in the past by not ensuring the providence of generated objects.
The controller should deserialize the values of the ControllerRevisions | ||
representing their target Object state and perform a deep, semantic equality | ||
test. Here all differences that do not constitute a mutation to the target | ||
Object state are |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns may be mention that we would do this deep semantic equality only when the hash is same
1. The controller will compute the [hash](#hashing) of the | ||
ControllerRevision's `.Data`. | ||
1. The controller will attach a label to the ControllerRevision so that it is | ||
selectable with a low probability of overlap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kow3ns still cant make out from the sentence, but are we storing the hash of the .Data inside the ControllerRevision or not. I think that will be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. As per previous SIG discussion the hash is used to name the object.
necessary to determine if the current target Object state is equivalent to a | ||
prior version. | ||
|
||
Since we [track the version of](#version-tracking) of generated Objects, this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lot of places , you use generated objects
, do you mean the Revision Object or the StatefulSet/DaemonSet ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pods. The terminology section makes this explicit.
1. Controllers can reconstruct their revision history. | ||
1. Controllers can't update a ControllerRevision's `.Data`. | ||
1. Controllers can delete a ControllerRevision to maintain their history with | ||
respect to the configured `RevisionHistoryLimit`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would we allow creating ControllerRevision in apiserver directly ? We should likely not allow that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the object is namespaced the default RBAC policies should prevent one user from clobbering the revisions created by another, unless they share the same namespace. If more fine grained control is needed RBAC can be adjusted to accommodate this.
- This may not necessarily be all fields of Object. | ||
- The target Object state is the subset of the target state necessary to create | ||
Objects of the generated type(s). | ||
- To make the this concrete, for the StatefulSet controller: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo
Had some final comments, lgtm overall |
`.Spec.Template` and `.Spec.VolumeClaims`. | ||
- The target Object state is the subset of the target state necessary to create | ||
Objects of the generated type(s). | ||
- To make the this concrete, for the StatefulSet controller `.Spec.Template` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/To make the this concrete/To make this concrete/
Objects along with its status. | ||
- For ReplicaSet, DaemonSet, and StatefulSet, the current state of the | ||
corresponding controllers can be derived from Pods they contain and the | ||
ReplicasSetStatus, DaemonSetStatus, and StatefulSetStatus objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the current state of a StatefulSet controller also include the PersistentVolumeClaims generated by the controller? Or can that always be derived from its pods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The target Object state for StatefulSet contains the .Spec.VolumeClaimsTemplate as well. Though these are currently immutable, and the semantics of mutation are not well defined atm.
This section is presented as a generalization of how an arbitrary controller | ||
can use ControllerRevision to persist a history of revisions to its | ||
specification type Objects. The technique is applicable, without loss of | ||
generality, to the existing Kuberentes controllers that have Pod as a generated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo s/Kuberentes/Kubernetes/
We can use our [hash function](#hashsing) and | ||
[collision resolution](#collision-resolution) scheme to generate a system | ||
wide unique identifier for an Object based on a deterministic non-unique prefix | ||
and a serialized representation of the Object. Kuberentes Object's `.Names` must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: s/Kuberentes/Kubernetes/
We can use our [hash function](#hashsing) and | ||
[collision resolution](#collision-resolution) scheme to generate a system | ||
wide unique identifier for an Object based on a deterministic non-unique prefix | ||
and a serialized representation of the Object. Kuberentes Object's `.Names` must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe reword for clarity? Something like:
Values in the .Name
field of Kubernetes Objects must conform to ...
- A revision is considered to be live while any generated Object labeled | ||
with its `.Name` is live. | ||
- This method has the benefit of providing visibility, via the label, to | ||
users with respect to the historical providence of a generated Object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think instead of providence you mean provenance? Used a couple of other places here as well.
and deciding if a generated Object corresponds to a particular revision | ||
will require testing each Object for membership in the `OwnerReferences` | ||
of all ControllerRevisions. | ||
1. Note that, since we are labeling the generated Objects to indicate their |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this belong as a third bullet point? I guess it applies to both 1 (labeling) and 2 (owner references). The introductory paragraph for this section refers to two methods, so having this here as a third bullet point seems odd. Maybe move it to a note at the bottom of the section?
restart of every Pod in every StatefulSet and DaemonSet in the system when | ||
additions are made to PodTemplateSpec during a master upgrade. It is therefore | ||
necessary to determine if the current target Object state is equivalent to a | ||
prior version. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, you care whether the target Object state is equivalent to the current version, not any prior version, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current version, in this context, is the state extracted from the specification object. All other state is prior.
necessary to determine if the current target Object state is equivalent to a | ||
prior version. | ||
|
||
Since we [track the version of](#version-tracking) of generated Objects, this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might have missed it, but I didn't see a mention of the behavior when the the controller revision is missing for a generated object -- e.g., after an upgrade from a k8s version that didn't support history.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is deferred to the individual proposals for history pertaining to each controller. This behavior is likely not going to be uniform across all controller implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, I'd suggest including a note to that effect somewhere within this proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a section
- If the current version is equivalent to a version prior to its immediate | ||
predecessor, this indicates a rollback. | ||
- If the current version is not equivalent to any prior version, this | ||
indicates an update or a roll forward. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this (determination of a rollback or an update) just informational, or is the controller expected to treat the two cases differently?
Also, detecting rollback is limited to the controller's RevisionHistoryLimit
, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StatefulSet cares about rollback vs rollforward. Deployment and DaemonSet probably do not.
This is really close to LGTM. Can reviewers drive any additional questions and resolution in by EOD so we can merge? |
continue to (re)create generated Objects based on the current target Object | ||
state. | ||
1. The history should be initialized on the first mutation to the specification | ||
type Object for which the history will be generated. | ||
1. After the history has been initialized, any generated Objects that have no | ||
indication of the revision from which the were generated may be treated as if | ||
indication of the revision from which they were generated may be treated as if | ||
the have a nil revision. That is, without respect to the method of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: "as if they have a nil revision."
Thanks for the updates and responses! Lgtm |
LGTM as well. Please squash and we'll merge. |
LGTM |
and StatefulSet update features.
Automatic merge from submit-queue Implement Daemonset history ~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged) Ref kubernetes/community#527 and kubernetes/community#594 @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis --- TODOs: - [x] API changes - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback - [x] deployment controller - [x] controller revision - [x] owner ref (claim & adoption) - [x] history reconstruct (put revision number, hash collision avoidance) - [x] de-dup history and relabel pods - [x] compare ds template with history - [x] hash labels (put it in controller revision, pods, and maybe deployment) - [x] clean up old history - [x] Rename status.uniquifier when we reach consensus in #44774 - [x] e2e tests - [x] unit tests - [x] daemoncontroller_test.go - [x] update_test.go - [x] ~(maybe) storage_test.go // if we do server side rollback~ kubectl part is in #46144 --- **Release note**: ```release-note ```
Automatic merge from submit-queue Implement Daemonset history ~Depends on #45867 (the 1st commit, ignore it when reviewing)~ (already merged) Ref kubernetes/community#527 and kubernetes/community#594 @kubernetes/sig-apps-api-reviews @kubernetes/sig-apps-pr-reviews @erictune @kow3ns @lukaszo @Kargakis --- TODOs: - [x] API changes - [x] (maybe) Remove rollback subresource if we decide to do client-side rollback - [x] deployment controller - [x] controller revision - [x] owner ref (claim & adoption) - [x] history reconstruct (put revision number, hash collision avoidance) - [x] de-dup history and relabel pods - [x] compare ds template with history - [x] hash labels (put it in controller revision, pods, and maybe deployment) - [x] clean up old history - [x] Rename status.uniquifier when we reach consensus in #44774 - [x] e2e tests - [x] unit tests - [x] daemoncontroller_test.go - [x] update_test.go - [x] ~(maybe) storage_test.go // if we do server side rollback~ kubectl part is in #46144 --- **Release note**: ```release-note ```
This reverts commit d42823d05e8a82c4cac50060c95048535702ac79.
Controller history proposal as per SIG Apps discussion.
This contribution is meant as a common sub-feature to be leveraged by the 1.7 StatefulSet update and DaemonSet update features.