Skip to content

Enhance content negotiation for zpages#135309

Merged
k8s-ci-robot merged 2 commits intokubernetes:masterfrom
richabanker:zpages
Jan 21, 2026
Merged

Enhance content negotiation for zpages#135309
k8s-ci-robot merged 2 commits intokubernetes:masterfrom
richabanker:zpages

Conversation

@richabanker
Copy link
Copy Markdown
Contributor

@richabanker richabanker commented Nov 14, 2025

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This PR:

  • Adds YAML and CBOR(gated by CBORServingAndStorage feature gate) serialization support for /flagz and /statusz endpoints
  • Add flagzCodecFactory and statuszCodecFactory wrapper types to filter out unsupported media types (e.g., protobuf) from the codec factory's supported media types list. This ensures that 406 Not Acceptable error messages only show media types that are actually supported by these endpoints
  • Add CBOR and YAML test cases to unit and integration tests

Which issue(s) this PR is related to:

kubernetes/enhancements#4827
kubernetes/enhancements#4828

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Enables YAML support for statusz and flagz.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels Nov 14, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Nov 14, 2025
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG Instrumentation Nov 14, 2025
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 14, 2025
@lmktfy
Copy link
Copy Markdown
Member

lmktfy commented Nov 17, 2025

I think this needs a changelog entry. If not, what's the thinking behind not including one?

Copy link
Copy Markdown
Member

@lmktfy lmktfy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What should we test as e2e?

{
name: "valid request for v1alpha1",
name: "valid request for application/json",
acceptHeader: "application/json;v=v1alpha1;g=config.k8s.io;as=Flagz",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we could request application/json (just that, no parameters) and / or application/yaml, application/cbor and see what happens.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have that below in the "unsupported application/json without params" testcase

@lmktfy
Copy link
Copy Markdown
Member

lmktfy commented Nov 17, 2025

If CBOR is enabled, can you fetch z pages as CBOR?

@dgrisonnet
Copy link
Copy Markdown
Member

/triage accepted
/assign

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 28, 2025
}
// Protobuf is not supported for zpages endpoints
if mimeType == "application" && mimeSubType == "vnd.kubernetes.protobuf" {
return false
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't only allowing application/json when the gvk is set be better?

if mimeType == "application" && mimeSubType == "json" {
  return isStructured(gvk)
}

return false

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone requests a zpage and their Accept: header indicates a preference for CBOR, and CBOR support is enabled, what should we serve?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dgrisonnet yeah thats more readable, except that I added checks for both json and yaml to the conditional.

@lmktfy I am not entirely sure that we want to support CBOR for statusz/flagz. Both the endpoints serve as debugging endpoints and have relatively small payloads. I also dont foresee users polling these endpoints every few seconds (which if it was the case, would be a good reason to support CBOR for faster parsing of the response). Since statusz and flagz responses are small diagnostic payloads that never make it to storage, I'm not sure there's significant benefit to adding CBOR support beyond consistency with other Kubernetes APIs. But I cant also think of any obvious downsides to it, so I am open to changing my mind :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also cc @liggitt for this opinion on CBOR support for statusz/flagz

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My (personal) opinion here, not representing any SIG.

Overall I would have it as either an exclusion list ("no Protobuf") or by calling a method to check "is this a text-like format".

So I don't especially support hard coding JSON and YAML as the formats. If we did add, let's say, XML or TOML or something, then the people doing that shouldn't need to look into the z-pages code specifically to find where to change the negotiation rules.

Copy link
Copy Markdown
Member

@liggitt liggitt Dec 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benluddy is cbor negotiation wired in for things like serialization of discovery formats today?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with supporting json / yaml / cbor consistently. I'm actually fine with specifically allowing those three so we can also pair that with tests of the behavior of those three. Making an exclusion list makes it hard to be confident we have exhaustively tested the outputs we allow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is cbor negotiation wired in for things like discovery today?

Not today. I don't know why it should not be. Opened a PR to add it to the KEP (kubernetes/enhancements#5740).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, commented there

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I added CBOR handling (I check for the CBORServingAndStorage feature gate, and if enabled, include the CBOR serializer when creating the codecFactory. Also added a TODO to remove this explicit check from statusz/flagz once we implement this in the codec_factory.go

Please lmk if this is ok or if I should just remove the explicit feature-gate check and enablement of CBOR support for statusz/flag, and instead wait for CBOR serializer to be included in code_factory in future to use it for zpages.

@richabanker richabanker force-pushed the zpages branch 2 times, most recently from 46985fb to 4ccd2a7 Compare December 13, 2025 22:18
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 13, 2025
@liggitt
Copy link
Copy Markdown
Member

liggitt commented Dec 16, 2025

This prevents erroneous 500 http code returned earlier for Accept: application/yaml;v=v1alpha1;g=config.k8s.io;as=Statusz

unsupported requested type should return a 406 error, not a 5xx, by fixing the default case of our switch statement:

		switch serializerInfo.MediaType {
...
		default:

I'm open to accepting application/json, application/yaml, and application/cbor if it's as simple as expanding our switch statement here (I think all three serializers should be able to serialize resulting object):

		switch serializerInfo.MediaType {
		case "application/json","application/yaml","application/cbor":

Comment thread staging/src/k8s.io/apiserver/pkg/server/flagz/negotiate/negotiation.go Outdated
Comment thread staging/src/k8s.io/apiserver/pkg/server/statusz/negotiate/negotiation.go Outdated
}
// Protobuf is not supported for zpages endpoints
if mimeType == "application" && mimeSubType == "vnd.kubernetes.protobuf" {
return false
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with supporting json / yaml / cbor consistently. I'm actually fine with specifically allowing those three so we can also pair that with tests of the behavior of those three. Making an exclusion list makes it hard to be confident we have exhaustively tested the outputs we allow.

@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Dec 17, 2025
@richabanker
Copy link
Copy Markdown
Contributor Author

Also, I noticed another thing, the error message returned for an unsupported media was the following:

root@zpages-test-control-plane:/# curl --cacert /etc/kubernetes/pki/ca.crt   --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt  --key /etc/kubernetes/pki/apiserver-kubelet-client.key -H "Accept: application/foo;v=v1alpha1;g=config.k8s.io;as=Statusz" https://localhost:6443/statusz
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "only the following media types are accepted: application/json, application/yaml, application/vnd.kubernetes.protobuf, text/plain, application/cbor",
  "reason": "NotAcceptable",
  "code": 406
}

The error included application/vnd.kubernetes.protobuf as a supported media type which is incorrect. I have tried to fix this by adding a wrapper around CodecFactory (statuszCodecFactory & flagzCodecFactory), and exclude protobuf from this wrapper's SupportedMediaTypes() implementation. With this, the error now shows:

curl --cacert /etc/kubernetes/pki/ca.crt   --cert /etc/kubernetes/pki/apiserver-kubelet-client.crt  --key /etc/kubernetes/pki/apiserver-kubelet-client.key -H "Accept: application/foo;v=v1alpha1;g=config.k8s.io;as=Statusz" https://localhost:6443/statusz
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "only the following media types are accepted: application/json, application/yaml, text/plain, application/cbor",
  "reason": "NotAcceptable",
  "code": 406
}

Happy to also revise this if this is not the preferred approach to fix the wrong list of supported types in the error.

Copy link
Copy Markdown
Member

@liggitt liggitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a couple comments, looks reasonable otherwise (also apply to the statusz file)

m.Handle(DefaultFlagzPath, handleFlagz(componentName, reg, filteredCodecFactory, negotiate.FlagzEndpointRestrictions{}))
}

func (f *flagzCodecFactory) SupportedMediaTypes() []runtime.SerializerInfo {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this called on every request to /flagz during negotiation? if so, let's calculate this once and cache/return to avoid reallocating on every request

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it was. Now caching it in flagzCodecFactory once and resuing that, thanks for the tip!

allTypes := f.CodecFactory.SupportedMediaTypes()
filtered := make([]runtime.SerializerInfo, 0, len(allTypes))
for _, info := range allTypes {
if info.MediaType == runtime.ContentTypeProtobuf {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prefer both positive and negative matching so any future additions get detected / handled

  • include text, json, yaml, cbor
  • exclude proto
  • consider panicing on another unknown type when running tests so we catch this and make a decision about whether to support a new format for this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Done and added a unit test to check for no errors when creating the flagzCodecFactory and statuszCodecFactory

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 7, 2026
@liggitt
Copy link
Copy Markdown
Member

liggitt commented Jan 21, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

LGTM label has been added.

DetailsGit tree hash: 859e15a9b6eef56b933d811f9723789475f9687c

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, richabanker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 21, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@richabanker: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kubernetes-integration a9dec0c link unknown /test pull-kubernetes-integration
pull-kubernetes-e2e-gce a9dec0c link unknown /test pull-kubernetes-e2e-gce

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@k8s-triage-robot
Copy link
Copy Markdown

Retesting failed PR that otherwise appears ready for merge.

Please help us fix flaky tests by following our Flaky Tests Guide.

Prevent this bot from retesting with /lgtm cancel or /hold.
For this robot's configuration, see here.

/retest-required

@k8s-ci-robot k8s-ci-robot merged commit 6fde485 into kubernetes:master Jan 21, 2026
11 of 13 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.36 milestone Jan 21, 2026
@github-project-automation github-project-automation Bot moved this from Needs Triage to Done in SIG Instrumentation Jan 21, 2026
@richabanker richabanker deleted the zpages branch January 22, 2026 05:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

8 participants