-
Notifications
You must be signed in to change notification settings - Fork 5.3k
Update guidance on required field serialization #8486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -822,50 +822,92 @@ indicated in the Retry-After header, if it is present). | |
|
|
||
| Fields must be either optional or required. | ||
|
|
||
| A field that is required means that the writer must express an opinion about the value of the field. | ||
| Required fields are always present, and readers can rely on the field being present in the object. | ||
|
|
||
| An optional field is one where the writer may choose to omit the field entirely. | ||
| Readers should not assume that the field is present, unless the field also has a server-side default value. | ||
|
|
||
| Default values can only be set on optional fields. | ||
|
|
||
| Optional fields have the following properties: | ||
|
|
||
| - They have the `+optional` comment tag in Go. | ||
| - They are a pointer type in the Go definition (e.g. `AwesomeFlag *SomeFlag`) or | ||
| have a built-in `nil` value (e.g. maps and slices). | ||
| - They are marked with the `omitempty` json struct tag in the Go definition. | ||
| - The API server should allow POSTing and PUTing a resource with this field | ||
| unset. | ||
|
|
||
| In most cases, optional fields should also have the `omitempty` struct tag (the | ||
| `omitempty` option specifies that the field should be omitted from the json | ||
| encoding if the field has an empty value). However, If you want to have | ||
| different logic for an optional field which is not provided vs. provided with | ||
| empty values, do not use `omitempty` (e.g. https://github.com/kubernetes/kubernetes/issues/34641). | ||
| When the field type has a built-in `nil` value, such as a map or a slice, and | ||
| your use case means that you need to be able to distinguish between | ||
| "field not set" and "field set to an empty value", you should use a pointer to | ||
| the type, even though it has a built-in `nil` value. | ||
| See https://github.com/kubernetes/kubernetes/issues/34641. | ||
|
|
||
| Note that for backward compatibility, any field that has the `omitempty` struct | ||
| tag will be considered to be optional, but this may change in the future and | ||
| having the `+optional` comment tag is highly recommended. | ||
| tag, and is not explicitly marked as `+required`, will be considered to be optional. | ||
| This is expected to change in the future, and new fields should explicitly set either | ||
| an `+optional` or `+required` comment tag. | ||
|
|
||
| Required fields have the opposite properties, namely: | ||
| Required fields have the following properties: | ||
|
|
||
| - They do not have an `+optional` comment tag. | ||
| - They do not have an `omitempty` struct tag. | ||
| - They are not a pointer type in the Go definition (e.g. `AnotherFlag SomeFlag`). | ||
| - They mark themselves as required explicitly with a `+required` comment tag. | ||
| - The API server should not allow POSTing or PUTing a resource with this field | ||
| unset. | ||
| - They _typically_ do not use pointer types in the Go definition (e.g. `AnotherFlag SomeFlag`), though required fields where the zero value is a valid value must use pointer types, paired with an `omitempty` struct tag to avoid spurious null serializations. | ||
|
|
||
| For more details on how to use pointers and `omitempty` with fields, see [Serialization of optional/required fields](#serialization-of-optionalrequired-fields). | ||
|
|
||
| Using the `+optional` or the `omitempty` tag causes OpenAPI documentation to | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. q: should this (openAPI documentation) change now that we can have omitempty on required fields?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We will need to make sure that it respects In the long run, we need to make sure all fields are marked as |
||
| reflect that the field is optional. | ||
|
|
||
| ### Serialization of optional/required fields | ||
|
|
||
| Using a pointer allows distinguishing unset from the zero value for that type. | ||
| There are some cases where, in principle, a pointer is not needed for an | ||
| optional field since the zero value is forbidden, and thus implies unset. There | ||
| are examples of this in the codebase. However: | ||
| There are some cases where, in principle, a pointer is not needed for a | ||
| field since the zero value is forbidden, and thus implies unset. | ||
| There are examples of this in the codebase. However: | ||
|
|
||
| - It can be difficult for implementors to anticipate all cases where an empty | ||
| value might need to be distinguished from a zero value. | ||
| - Structs are not omitted from encoder output even where `omitempty` is specified, | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What about omitzero?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Omitzero should work, but as far as I know, we haven't tested this yet, or explored what would happen re-protobuf for built-in types. For now I'd rather not mention omitzero until we've had a chance to test it. I'd see that as a follow up.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1 to this -- we should be clear that "encoding" means a lot more than just JSON :) |
||
| which is messy. | ||
|
|
||
| To determine whether a field should be a pointer, consider the following: | ||
|
|
||
| ```mermaid | ||
| graph TD; | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is not in sync with Kal with whenrequired setting, right?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correct. KAL's optionalfields implementation of For CRDs, because the validation of required vs optional happens at the openapi stage, and for any field where the zero value is not valid, openapi would pick this up before a structured go client were to see the object, we can safely assume for CRDs that for a field where the zero value isn't valid, that the user did not specify the zero value when we see the zero value in the Go struct. With built-in types, all of the validation above is done using Go. And as such, we must be able to distinguish between unset and the zero value to be able to enforce that the zero value is not valid, hence a pointer is always required in these cases. I think we should probably explain this somewhere as an addendum to this advice. |
||
| A[Start] --> B{Is the zero value a valid user choice?}; | ||
| B -- Yes --> C[Use a pointer and omitempty]; | ||
| B -- No --> D{Is the field optional or required?}; | ||
| D -- Optional --> E["Does the field type have a built-in nil value (map or slice)?"]; | ||
| E -- Yes --> F[Do not use a pointer, use omitempty]; | ||
| E -- No --> C; | ||
| D -- Required --> F | ||
| ``` | ||
| There are several implications of the above: | ||
| - For lists and maps where the zero valid is a valid user choice, this means that `[]` and `{}` have a semantically different meaning than unset, in this case it is appropriate to use `*[]T` or `*map[T]S` respectively. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. See comment above - I am not sure we want to allow this. IIRC protobuf doesn't have a way to represent this without an intermediate struct.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hrmm... you're probably right about proto-persisted things. For typed clients that use go but are backed on the server by a CRD, that means that serialization is always json and this is representable. I agree a pointer to a map or slice is weird. I think it's useful to describe this implication / edge, but would be fine discouraging making use of it ("avoid map or slice fields where
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have one use case in #8486 (comment), but yes, when I've thought through this before, I've never worked out an issue with doing this in CRDs. If it's an issue for proto, then perhaps the right thing to do here is update the wording to say never do this for built-in (?) types, but it can work for custom resources?
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There are three areas rationale is based on:
2 and 3 are definitely different for custom-resources vs built-in/server-typed resources, and custom resources are actually more predictable since they store as json and don't conflate absent and present-with-zero-value on the server side. I guess we could branch the guidance early on if we think it would lead to different/simpler/better recommendations for one case or the other.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would argue that the word "pointer" has no real meaning in CRD and in fact our API docs should describe a language-neutral data model. We happen to use Go to implement that model, and while Go has the concept of array, nil-slice, empty-slice, and pointer-to-slice, those are not part of our data model. The question then is whether CRD should be able to do things that proto can't, because proto is static and CRD less so? I am fairly sure we could put tag numbers into CRDs and synthesize a proto descriptor, but we don't. I think our data model should be limited to the common subset (basically "what proto can do"), for CRD /and/ built-in, otherwise we end up with divergent API patterns depending on implementation. Blech. We support list (aka repeated) fields. They can have 0 or more items. They are not present or absent, they just have a size.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Do we have any data on how many custom resource APIs for K8s are written not in Go? My assumption has always been that the majority are, and therefore having guidance that applies to the majority, and helps prevent them from making serialization mistakes has value. Perhaps the missing part is that we don't talk about these being Go specific in the guidance?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thinking on this today, does this extend beyond slices and maps? It seems for API server/aggregated types we want to avoid folks creating distinctions between unset and the empty value for lists and maps. And while this works for custom resources, it should also be discouraged since while it works with Go, if they support other languages for clients, it may be difficult for the other languages to represent this. Now I can see a clear use case for allowing zero values for numbers, but what about strings? Do we have valid use cases for For structs, I can think of an example ( I'm planning to submit a follow up for this to clarify our stance here, so I'd like to make sure we cover all data types in our update.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I am sure there are fields where
I am 99.99% sure yes. Proto says
This is represented as
optional messages are supported, even if the message is empty of other tags.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Definitely. Example is a reference to a type that allows specifying API group, where
Strings serialize as tag+length+bytes, so happily serialize zero-length strings (which is why new optional non-pointer string fields immediately mess with proto serialization). StorageClassName *string `json:"storageClassName,omitempty" protobuf:"bytes,5,opt,name=storageClassName"`marshals as: if m.StorageClassName != nil {
i -= len(*m.StorageClassName)
copy(dAtA[i:], *m.StorageClassName)
i = encodeVarintGenerated(dAtA, i, uint64(len(*m.StorageClassName)))
i--
dAtA[i] = 0x2a
}unmarshals as: case 5:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field StorageClassName", wireType)
}
var stringLen uint64
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowGenerated
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
stringLen |= uint64(b&0x7F) << shift
if b < 0x80 {
break
}
}
intStringLen := int(stringLen)
if intStringLen < 0 {
return ErrInvalidLengthGenerated
}
postIndex := iNdEx + intStringLen
if postIndex < 0 {
return ErrInvalidLengthGenerated
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
s := string(dAtA[iNdEx:postIndex])
m.StorageClassName = &s
iNdEx = postIndexThat means proto can distinguish between present and absent on decode, but, unless the go type it is decoding into is a
Our current proto marshaling doesn't treat Request []byte `json:"request" protobuf:"bytes,1,opt,name=request"`marshals as: if m.Request != nil {
i -= len(m.Request)
copy(dAtA[i:], m.Request)
i = encodeVarintGenerated(dAtA, i, uint64(len(m.Request)))
i--
dAtA[i] = 0xa
}unmarshals as: case 1:
if wireType != 2 {
return fmt.Errorf("proto: wrong wireType = %d for field Request", wireType)
}
var byteLen int
for shift := uint(0); ; shift += 7 {
if shift >= 64 {
return ErrIntOverflowGenerated
}
if iNdEx >= l {
return io.ErrUnexpectedEOF
}
b := dAtA[iNdEx]
iNdEx++
byteLen |= int(b&0x7F) << shift
if b < 0x80 {
break
}
}
if byteLen < 0 {
return ErrInvalidLengthGenerated
}
postIndex := iNdEx + byteLen
if postIndex < 0 {
return ErrInvalidLengthGenerated
}
if postIndex > l {
return io.ErrUnexpectedEOF
}
m.Request = append(m.Request[:0], dAtA[iNdEx:postIndex]...)
if m.Request == nil {
m.Request = []byte{}
}
iNdEx = postIndex
Right. We even have zero-field types in a few places where presence / absence drives behavior. // CustomResourceSubresourceStatus defines how to serve the status subresource for CustomResources.
// Status is represented by the `.status` JSON path inside of a CustomResource. When set,
// * exposes a /status subresource for the custom resource
// * PUT requests to the /status subresource take a custom resource object, and ignore changes to anything except the status stanza
// * PUT/POST/PATCH requests to the custom resource ignore changes to the status stanza
type CustomResourceSubresourceStatus struct{}
// CustomResourceSubresources defines the status and scale subresources for CustomResources.
type CustomResourceSubresources struct {
// status indicates the custom resource should serve a `/status` subresource.
// When enabled:
// 1. requests to the custom resource primary endpoint ignore changes to the `status` stanza of the object.
// 2. requests to the custom resource `/status` subresource ignore changes to anything other than the `status` stanza of the object.
// +optional
Status *CustomResourceSubresourceStatus `json:"status,omitempty" protobuf:"bytes,1,opt,name=status"`
...for that to be usable in the resulting go object, the field must be a pointer to be able to distinguish presence / absence
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok great, so we have examples of zero values being accepted for most types, and that working for proto, so it appears it really is just I'll post an update to discourage distinguishing between absent and zero length lists/maps |
||
| - For `bool` types, the zero value is `false`, which is always a valid user choice. `bool` types should always be pointers, and should always use the `omitempty` tag. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hi, I wanted to ask about this. For maps and slices which have a built-in nil value, such as a map or a slice, and your use case means that you need to be able to distinguish between "field not set" and "field set to an empty value", you should use a pointer to the type, even though it has a built-in nil value. But otherwise not. Woudn't similar logic applies for
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
In this case, you have several options. Field is omitted, no validation occurs. Field is present and empty - this should be rejected as the map/list requires at least one entry. Field is present and has at least one entry, all good. For Go types, the first and the second are the same if you do not have a pointer. Omitempty would mean that the field isn't serialized and so the second can't actually happen/ is not representable in Go. But it can happen for unstructured types which is why the minimum entries validation must exist. For bools, false is always a valid choice, so you need to be able to round trip false for every bool, even if that means the same as omitted. |
||
| - When a field is required, and the zero value is not valid, a structured client who has not expressed an explicit choice will have their request rejected by the API server based on the invalid value, rather than the field being unset. | ||
| - For example, a string with a minimum length of 1; Validation would not understand if the field was unset, or set to the empty string deliberately, but would still reject the request because it did not meet the length requirements. | ||
| - Technically, using a pointer in these cases is also acceptable, but not advised as it makes coding more complex, and increases the risk of nil pointer exceptions. | ||
| - In these cases, not using `omitempty` provides the same result, but pollutes the marshaled object with zero values and is not recommended. | ||
| - For structs, the zero value can only be valid when the struct has no required fields, and does not require at least one property to be set. | ||
| - Required structs should use `omitzero` to avoid marshalling the zero value. | ||
|
|
||
| #### Serialization of custom resources | ||
|
|
||
| When custom resources are admitted by the API server, openapi validation is applied to the object _prior_ to any structured client observing the object. | ||
|
|
||
| - it can be difficult for implementors to anticipate all cases where an empty | ||
| value might need to be distinguished from a zero value | ||
| - structs are not omitted from encoder output even where omitempty is specified, | ||
| which is messy; | ||
| - having a pointer consistently imply optional is clearer for users of the Go | ||
| language client, and any other clients that use corresponding types | ||
| For a field where the zero value is not valid, the openapi validation will reject the object if the field is present and set to the zero value, | ||
| before a controller or validation webhook could observe the field. | ||
|
|
||
| Therefore, we ask that pointers always be used with optional fields that do not | ||
| have a built-in `nil` value. | ||
| This means that there is no need to distinguish in a custom resource, between unset and the zero value for fields where the zero value is not valid. | ||
| In these cases, pointers are not needed, as the zero value indicates to the structured client that the field is unset. | ||
|
|
||
| This can be beneficial to API authors, as it reduces the complexity of the API, and reduces the risk of nil pointer exceptions in controllers. | ||
|
|
||
| ## Defaulting | ||
|
|
||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we do not have any pointer-to-{slice,map} fields and I have told people in the past that we DO NOT want them. This makes it sound like it would be OK, but I don't think it would (I strongly suspect some codegen would break). I was actually a bit surprised that JSON unmarshal handles it properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One use case I have seen is
*[]corev1.Taint. Where the use case will default "I have no opinion" to include a set of default taints, but, if you specifically do not want those taints, it documents that you should explicitly provide an empty list[]for the field.Do you think that would not actually be possible then? What would you recommend instead for this use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it doesn't work as described. We tried to do this in network policy and gave up because it did not work.
The only answer I know, and the way we did it in straight proto (caveat: it's been a while) is to put a (pointer to) struct in between. Nil struct means default. Non-nul means the list is present. All languages can decode that into something intelligible.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about builtin types, for CRDs it looks like controller-gen works (we generate deep copy, conversion and CRD YAMLs). I just replaced a custom MarshalJSON func for []corev1.Taint (to preserve
[]) with*[]corev1.Taintand it worked out of the box