Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept OpenAPI spec instead of JSON Schema for CRD validation #49879

Closed
mbohlool opened this issue Jul 31, 2017 · 25 comments
Closed

Accept OpenAPI spec instead of JSON Schema for CRD validation #49879

mbohlool opened this issue Jul 31, 2017 · 25 comments
Labels
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@mbohlool
Copy link
Contributor

mbohlool commented Jul 31, 2017

According to this proposal:

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/customresources-validation.md

We accept JSON Schema for validation. Kubectl and other tools we have, uses OpenAPI to validate objects (as well as other custom validations on the server). While The proposal claims that OpenAPI and JSON Schema are compatible, but they are not. OpenAPI's Schema is a subset of JSON Schema (e.g. it does not support multiple values for type field). We would be in much better shape if we ask for the OpenAPI spec of the custom resource. In that way, we can just aggregate the Spec with current OpenAPI spec and kubectl and other tools can use that to validate the resource. If we accept a JSON Schema, it would not be possible to convert it to OpenAPI if user provides any incompatible values for that Schema.

@nikhita, @sttts @lavalamp

@k8s-github-robot
Copy link

@mbohlool
There are no sig labels on this issue. Please add a sig label by:

  1. mentioning a sig: @kubernetes/sig-<group-name>-<group-suffix>
    e.g., @kubernetes/sig-contributor-experience-<group-suffix> to notify the contributor experience sig, OR

  2. specifying the label manually: /sig <label>
    e.g., /sig scalability to apply the sig/scalability label

Note: Method 1 will trigger an email to the group. You can find the group list here and label list here.
The <group-suffix> in the method 1 has to be replaced with one of these: bugs, feature-requests, pr-reviews, test-failures, proposals

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 31, 2017
@mbohlool
Copy link
Contributor Author

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jul 31, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jul 31, 2017
@sttts
Copy link
Contributor

sttts commented Jul 31, 2017

Copying my answer from kubernetes/community#708 (comment):

Why should we restrict ourselfes to openapi? The openapi spec is a side-product of this validation and I don't want to loose expressivity for validation because openapi hasn't updated to the JSON schema draft 4 yet.

Moreover, some day openapi v3 is finalized, we will support v3 and v2 for backward compatibility. Do you suggest that we stay at the common denominator, i.e. to that JSON schema subset supported by v2? In other words, at some point we will have to diverge anyway.

Is there anything that stops us from stripping the given spec from unsupported features for consumation by the openapi aggregator? As far as I see all JSON Schema fields are monotonic in the sense that stripping fields makes the spec weaker. That's what we need for easy openapi consumation.

@sttts
Copy link
Contributor

sttts commented Jul 31, 2017

@mbohlool can you summerize the v3 restrictions regarding JSON Schema draft 4?

@sttts
Copy link
Contributor

sttts commented Jul 31, 2017

@mbohlool Regarding my comment above: which API guarantees do we have regarding the served OpenAPI spec? Do we have to version that? Or do we expect that every consumer now suddenly support v3 and doesn't fall over the new schema?

@nikhita
Copy link
Member

nikhita commented Jul 31, 2017

@mbohlool From the OpenAPI v3 specification: https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md#properties, the only difference between the types that are mentioned in the proposal and the OpenAPI types is the type field.

We use an array of strings (instead of a single string) right now but it can be changed. Also, patternProperties can be changed to accommodate OpenAPI.

Do you believe there are any further problems with the types that we have now? IMHO it should work with OpenAPI well after these changes.

@mbohlool
Copy link
Contributor Author

mbohlool commented Jul 31, 2017

There are differences in v2 and v3 spec. Why do we want to say we accept JSON Schema while we really accept the subset that is OpenAPI v3 (and meanwhile till we upgrade kubernetes to OpenAPI v3, OpenAPI v2) defined. I think we better just get OpenAPI spec from user and validate based on that (kubectl is already doing that). We would accept v2 now, when we upgrade kubernetes to accept v3, we would accept v3 instead of v2 in CRDs too (and would have the same level of backward compatibility we will foreseen for the whole kubernetes).

@sttts
Copy link
Contributor

sttts commented Aug 3, 2017

We have analysed the different specs:

  • JSON-Schema draft 4,
  • OpenAPI v2,
  • OpenAPI v3,
  • go-openapi's support as of today.

Here is a spread-sheet with the results: https://docs.google.com/spreadsheets/d/1Mkm9L7CXGvRorV0Cr4Vwfu0DH7XRi24YHPiDK1NZWo4/edit?usp=sharing

These are the main takeaways:

  • go-openapi supports most of JSON Schema draft 4, only lacks some (optional) formats

  • go-openapi supports most of OpenAPI v3, only lacks some documentary fields: nullable, writeOnly, deprecated

  • go-openapi supports all of OpenAPI v2.

  • OpenAPI v2 lacks the anyOf operator. Removing this from a CRD spec makes a lot specifications impossible. this hurts. anyOf basically allows to express alternatives, e.g. to have two variants or versions embedded in a CR spec.

  • We can easily drop any anyOf fields when exporting to the OpenAPI spec. Every object validating under go-openapi will validate under the served OpenAPI spec then. So this is consistent, though not complete (until we serve OpenAPI v3 spec). But consistency is what is required. Completeness is a nice to have.

    => We have a trivial, consistent conversion to OpenAPI spec.

  • If we go with full JSON-Schema draft 4, we will have to translate:

    • array items
    • possibly no type-matching defaults
    • array types

    => It's not so obivous how to map these to OpenAPI, not even to v3. So in the near future we won't have a good OpenAPI v3 mapping for the OpenAPI v3 spec.

My conclusion from this:

We should go with OpenAPI v3 and for the time being drop everything in the OpenAPI spec that is not in OpenAPI v2. But in the near future – when go-openapi supports v3 completely and we have switched to that version, we will be completely clean and have a real 1:1 mapping.

@sttts
Copy link
Contributor

sttts commented Aug 3, 2017

Now the technical part of a change from JSON-Schema to OpenAPI v3:

Let's assume we want to be OpenAPI v3 compatible in the validation. We also want to be able to upgrade some day, e.g. to v4 as supported by go-openapi.

Then we have these choices:

  1. leave it as it is and do everything implicitly, i.e. silently allow more features of JSON Schema / OpenAPI
  2. use the $schema field and some URL to declare which standard is used https://some-openapi-schema-url/v3.
  3. turn CustomResourceValidation.JSONSchema into a RawExtension and define a JSONSchema object with APIVersion and Kind. Then we can use Kind: "OpenAPISpecV3"
  4. rename CustomResourceValidation.JSONSchema to CustomResourceValidation.OpenAPISpecV3 and add CustomResourceValidation.OpenAPISpecV4 some day.

@nikhita
Copy link
Member

nikhita commented Aug 3, 2017

Re: technical part of change from JSON-Schema to OpenAPI v3:

  1. things could change anyhow in the future. What if some field gets deleted? Removing a field would be a breaking change.
  2. we could do this but would need to change the structure as the standard changes. For one structure, there is really only one $schema value possible.
  3. and 4. seems good. But will this create a "bloat"? As standards keep on changing, we'll have more and more types...

@sttts
Copy link
Contributor

sttts commented Aug 3, 2017

@deads2k @enisoc @bgrant0607 @brendandburns @mbohlool please comment on the upper two design decisions for CRD validation:

A) which JSON Schema subset #49879 (comment)
B) how to express different JSON Schema subset versions #49879 (comment)

cc @kubernetes/sig-api-machinery-api-reviews

@deads2k
Copy link
Contributor

deads2k commented Aug 3, 2017

I would go ahead and create flavors of validation schema (option 4), so that we can easily have a backwards compatible update later on.

I'm less opinionated on which standard to use. I would have defaulted to jsonschema, but if that makes for bad conversion problems I'm ok with OpenAPISpecV3. It does seem a bit like the tail wagging the dog though.

@mbohlool
Copy link
Contributor Author

mbohlool commented Aug 8, 2017

I like option 4. With that option, we can also have OpenAPISpecV2 today. Users can provide either OpenAPISpecV3 or OpenAPISpecV2.

  • If they provide OpenAPISpecV2 we would use that today and up-convert it when we support OpenAPI v3.
  • If they provide OpenAPISpecV3 we would down-convert it now and when we support OpenAPI v3, use that directly.
  • If they provide both, we would use the appropriate one based on the version of the OpenAPI we support at the time.
  • We can also start deprecating old versions. e.g. deprecating v2 when we support v4 (or even v3), etc.

@sttts
Copy link
Contributor

sttts commented Aug 8, 2017

Sounds like option 4 is the chosen one. One addition: we do not default the empty fields. But we might migrate some day when V2 is removed.

@nikhita
Copy link
Member

nikhita commented Aug 8, 2017

@mbohlool Do you think a user should be allowed to provide two schemas i.e. one for v2 and one for v3? (If yes, we will have to validate against both).

@sttts
Copy link
Contributor

sttts commented Aug 8, 2017

I would only allow one.

@mbohlool
Copy link
Contributor Author

mbohlool commented Aug 8, 2017

I agree with @sttts. we should only allow one.

Furthermore, I see from your change that you are using (a copy of) go-openapi's SchemaProps.
First, why a copy, why not use the type directly? (We need to keep it in sync when a complete support of v3 added to go-openapi if we keep a copy).

Second, I think we should use Definitions not SchemaProps. Definitions is a map from a name to Schema. This way we support references too (most of OpenAPI specs have multiple definitions and references inside definitions to other definitions. There is a big chance that users use our OpenAPI spec generator that creates a definition for each type and sub-type and reference between them, this is common for other generators too). To find the definition to start with, you can look at the map element with the key of full qualified CRD name. To reuse your existing code, you can convert this to a definition with no reference using a simple recursive method.

@nikhita
Copy link
Member

nikhita commented Aug 8, 2017

First, why a copy, why not use the type directly?

It was decided that we should not allow our types to be dependent on external types. kubernetes/community#708 (comment).

Second, I think we should use Definitions not SchemaProps.

Maybe I am missing something here but both the v3 and v2 specifications say that Definitions is not supported?

@mbohlool
Copy link
Contributor Author

mbohlool commented Aug 8, 2017

I meant this definition: https://github.com/go-openapi/spec/blob/f81e0f768713035fe232cd0d44a1c56e4eef1383/swagger.go#L190

Or you can look at it as a map[string]SchemaProps I guess. The point is to support references like "$ref": "#/definitions/Metdata". It would be very limiting if we don't support that.

@sttts
Copy link
Contributor

sttts commented Aug 9, 2017

Furthermore, I see from your change that you are using (a copy of) go-openapi's SchemaProps.
First, why a copy, why not use the type directly? (We need to keep it in sync when a complete support of v3 added to go-openapi if we keep a copy).

There was a long discussion in the proposal. For a number of reasons we decided against reusing go-openapi's types. Main argument: we don't control them. I am strongly against changing this.

Second, I think we should use Definitions not SchemaProps. Definitions is a map from a name to Schema. This way we support references too (most of OpenAPI specs have multiple definitions and references inside definitions to other definitions. There is a big chance that users use our OpenAPI spec generator that creates a definition for each type and sub-type and reference between them, this is common for other generators too). To find the definition to start with, you can look at the map element with the key of full qualified CRD name. To reuse your existing code, you can convert this to a definition with no reference using a simple recursive method.

I also strongly disagree here. We are not defining an OpenAPI type universe here, but define the CRD schema. That it's compatible with OpenAPI is a side-produce (with a lot of value, I don't question that). A Schema is a SchemaProps. If we want to allow references (which I am not against at all), we have to generate the names automatically from the CRD name and type. This is nothing for the user to decide.

@enisoc
Copy link
Member

enisoc commented Aug 9, 2017

Is it still a non-goal to actually implement OpenAPI generation for CRD right now? If so, I think we should focus on offering the best interface for the goal at hand -- declarative specification of server-side Validate() behavior.

For the purpose of specifying server-side validation, it should be enough to support only OpenAPIV3 right now, since it's significantly better than OpenAPIV2 (e.g. anyOf). If we lose some fidelity when later implementing conversion to OpenAPIV2, that's ok. Autogenerating client-side validation from server-side validation is only a nice-to-have anyway; you can't autogenerate anything if server-side validation is native code or a webhook.

On a minor note, I suggest calling the field OpenAPIV3Schema instead of OpenAPISpecV3, because we are not accepting a full OpenAPI Spec, only the Schema object as defined in that spec.

@sttts
Copy link
Contributor

sttts commented Aug 10, 2017

OpenAPIV3Schema instead of OpenAPISpecV3

+1

We (I hope @mbohlool volunteers 😃) will do OpenAPI generation as a follow-up, not sure whether in 1.8 though.

I am fine with having v2 or dropping it. Some day v4 will be there and then we have the second spec. @nikhita has implemented the additional v2 checks already and the code is pretty simple and no burden for maintenance. So I leave the decision to @mbohlool who stressed the point to have the v2 spec as well.

@mbohlool
Copy link
Contributor Author

mbohlool commented Aug 11, 2017

My suggestion to have v2 and v3 were mostly relevant if we were going the path of accepting more generic OpenAPI spec. In current form, I would also agree that providing two set of spec would make things more complicated and it is unnecessary (sorry @nikhita that my suggestion make your life harder for this PR). Let's remove v2 and keep OpenAPISpecV3 only. We can do best effort to convert it to v2 on CRD OpenAPI generation and point it out to CRD users that if they want a complete and valid OpenAPI spec being generated for their Resource, they should not use V3 features like oneOf.

We (I hope @mbohlool volunteers 😃) will do OpenAPI generation as a follow-up, not sure whether in 1.8 though.

I don't think this would be included in 1.8 (I can always try 😃). It shouldn't be hard to do it if we figure out the aggregation layer updating problem (that should be fixed in 1.8).

k8s-github-robot pushed a commit that referenced this issue Aug 30, 2017
Automatic merge from submit-queue

apiextensions: validation for customresources

- [x] Add types for validation of CustomResources
- [x] Fix conversion-gen: #49747
- [x] Fix defaulter-gen: kubernetes/gengo#61
- [x] Convert to OpenAPI types
- [x] Validate CR using go-openapi
- [x] Validate CRD Schema
- [x] Add integration tests
- [x] Fix round trip tests: #51204 
- [x] Add custom fuzzer functions
- [x] Add custom conversion functions
- [x] Fix data race while updating CRD: #50098 
- [x] Add feature gate for CustomResourceValidation
- [x] Fix protobuf generation

Proposal: kubernetes/community#708
Additional discussion: #49879, #50625

**Release note**:

```release-note
Add validation for CustomResources via JSON Schema.
```

/cc @sttts @deads2k
@nikhita
Copy link
Member

nikhita commented Aug 30, 2017

I believe this can be closed now.

@sttts
Copy link
Contributor

sttts commented Aug 30, 2017

Yes, it's OpenAPI v3 now.

@sttts sttts closed this as completed Aug 30, 2017
sttts pushed a commit to sttts/apiextensions-apiserver that referenced this issue Sep 22, 2017
Automatic merge from submit-queue

apiextensions: validation for customresources

- [x] Add types for validation of CustomResources
- [x] Fix conversion-gen: #49747
- [x] Fix defaulter-gen: kubernetes/gengo#61
- [x] Convert to OpenAPI types
- [x] Validate CR using go-openapi
- [x] Validate CRD Schema
- [x] Add integration tests
- [x] Fix round trip tests: #51204
- [x] Add custom fuzzer functions
- [x] Add custom conversion functions
- [x] Fix data race while updating CRD: #50098
- [x] Add feature gate for CustomResourceValidation
- [x] Fix protobuf generation

Proposal: kubernetes/community#708
Additional discussion: kubernetes/kubernetes#49879, kubernetes/kubernetes#50625

**Release note**:

```release-note
Add validation for CustomResources via JSON Schema.
```

/cc @sttts @deads2k

Kubernetes-commit: 4457e43e7b789586096bfb564330295cf0438e70
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.
Projects
None yet
Development

No branches or pull requests

7 participants