Instrument flagz and statusz endpoints with apiserver request metrics #137021

Merged
k8s-ci-robot merged 2 commits into kubernetes:master from yongruilin:master_zpages-metrics
Feb 28, 2026

Conversation

@yongruilin
Contributor

@yongruilin yongruilin commented Feb 14, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Instruments the /flagz and /statusz endpoints with metrics.InstrumentHandlerFunc so that requests are recorded in apiserver_request_total and apiserver_request_duration_seconds, matching the existing pattern used by /healthz, /livez, and /readyz. This lets operators detect potential abuse or DDoS attacks against these zpages endpoints via the standard apiserver HTTP metrics.
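
As a rough illustration of what instrumenting an endpoint means here, the sketch below wraps a handler so that every request is counted and timed per path. It uses only the Go standard library. The names `requestStats`, `instrument`, `newDemoMux`, and `serveOnce` are hypothetical and exist only for this example; this is not the `metrics.InstrumentHandlerFunc` API from k8s.io/apiserver, which records Prometheus metrics with a richer label set.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// requestStats is a hypothetical in-memory stand-in for a request counter
// and a duration metric, keyed by endpoint path.
type requestStats struct {
	mu       sync.Mutex
	count    map[string]int
	duration map[string]time.Duration
}

func newRequestStats() *requestStats {
	return &requestStats{
		count:    map[string]int{},
		duration: map[string]time.Duration{},
	}
}

// instrument wraps next so that each call is counted and timed under path.
func (s *requestStats) instrument(path string, next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next(w, r)
		elapsed := time.Since(start)

		s.mu.Lock()
		defer s.mu.Unlock()
		s.count[path]++
		s.duration[path] += elapsed
	}
}

// newDemoMux registers placeholder /flagz and /statusz handlers, both wrapped
// with instrument. The handler bodies are dummies, not the real zpages.
func newDemoMux(s *requestStats) *http.ServeMux {
	mux := http.NewServeMux()
	for _, path := range []string{"/flagz", "/statusz"} {
		mux.HandleFunc(path, s.instrument(path, func(w http.ResponseWriter, r *http.Request) {
			fmt.Fprintln(w, "ok")
		}))
	}
	return mux
}

// serveOnce sends one in-memory GET request and returns the status code.
func serveOnce(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	stats := newRequestStats()
	mux := newDemoMux(stats)
	serveOnce(mux, "/flagz")
	serveOnce(mux, "/flagz")
	serveOnce(mux, "/statusz")
	fmt.Println(stats.count["/flagz"], stats.count["/statusz"])
}
```

A sudden jump in the per-path count is the kind of signal an operator could alert on when looking for abuse of these endpoints.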

Which issue(s) this PR is related to:

KEP: kubernetes/enhancements#4828
kubernetes/enhancements#5806 (comment)

Special notes for your reviewer:

The instrumentation follows the exact same pattern as the health endpoints in staging/src/k8s.io/apiserver/pkg/server/healthz/healthz.go. Tests also follow the existing TestMetrics pattern in healthz_test.go.

Does this PR introduce a user-facing change?

Instrument /flagz and /statusz endpoints with apiserver request metrics (apiserver_request_total, apiserver_request_duration_seconds), with group and version labels reflecting the content-negotiated API version.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/4828-component-flagz

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 14, 2026
@k8s-ci-robot k8s-ci-robot added area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels Feb 14, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 14, 2026
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG Instrumentation Feb 14, 2026
@lmktfy
Member

lmktfy commented Feb 14, 2026

I think we should add a changelog entry.

@yongruilin yongruilin force-pushed the master_zpages-metrics branch from 5e72a6c to 767f8cd Compare February 17, 2026 20:01
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note-none Denotes a PR that doesn't merit a release note. labels Feb 17, 2026
@yongruilin
Contributor Author

> I think we should add a changelog entry.

Updated.

@yongruilin
Contributor Author

/assign @richabanker

…rics

Use MonitorRequest instead of InstrumentHandlerFunc so that group,
version, and deprecated status are derived from content negotiation
on each request. For text/plain requests, group and version are empty.
For structured responses (JSON/YAML/CBOR), they reflect the negotiated
API version (e.g., config.k8s.io/v1alpha1). This ensures accurate
metric labels even when clients request deprecated API versions.

Add TestMetrics for both endpoints, verifying that requests are
recorded in apiserver_request_total with correct label values.
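
The label derivation described in this commit message can be sketched as follows. This is a minimal illustration, not the PR's code: `labelsForAccept` and `requestLabels` are hypothetical names, and the sketch assumes that structured responses are requested with an Accept header carrying `g` and `v` media type parameters (for example `application/json;v=v1alpha1;g=config.k8s.io;as=Flagz`). It also ignores q-values and simply takes the first usable media type, which real content negotiation would not do.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// requestLabels holds the group and version label values for one request.
type requestLabels struct {
	Group   string
	Version string
}

// labelsForAccept is a hypothetical helper that derives group and version
// labels from an Accept header. text/plain (or anything without both g and v
// parameters) yields empty labels.
func labelsForAccept(accept string) requestLabels {
	for _, part := range strings.Split(accept, ",") {
		mediaType, params, err := mime.ParseMediaType(strings.TrimSpace(part))
		if err != nil {
			continue // skip malformed or empty entries
		}
		if mediaType == "text/plain" {
			return requestLabels{}
		}
		if g, v := params["g"], params["v"]; g != "" && v != "" {
			return requestLabels{Group: g, Version: v}
		}
	}
	return requestLabels{}
}

func main() {
	for _, accept := range []string{
		"text/plain",
		"application/json;v=v1alpha1;g=config.k8s.io;as=Flagz",
	} {
		l := labelsForAccept(accept)
		fmt.Printf("accept=%q group=%q version=%q\n", accept, l.Group, l.Version)
	}
}
```

Deriving the labels per request, rather than fixing them when the handler is registered, is what lets a single endpoint report different group and version values depending on what the client negotiated.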
@yongruilin yongruilin force-pushed the master_zpages-metrics branch from 767f8cd to e728229 Compare February 17, 2026 20:44
@Jefftree
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 17, 2026
"", // resource
DefaultFlagzPath, // subresource
"", // scope
"", // component
Contributor

Can we put "apiserver" here? Is that what component is supposed to mean?

Contributor Author

Hmm, healthz doesn't pass the "component" field. But I think we should pass "apiserver" here?

Contributor

The componentName that's passed to handleFlagz()?

Contributor Author

SG, updated.

"", // resource
DefaultStatuszPath, // subresource
"", // scope
"", // component
Contributor

Same comment about component.

# HELP apiserver_request_total [STABLE] Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.
# TYPE apiserver_request_total counter
apiserver_request_total{code="200",component="",dry_run="",group="",resource="",scope="",subresource="/flagz",verb="GET",version=""} 1
apiserver_request_total{code="200",component="",dry_run="",group="config.k8s.io",resource="",scope="",subresource="/flagz",verb="GET",version="v1alpha1"} 1
Contributor

Nit: it's including the "/", and in subresource at that. Should this be in resource instead, without the "/"?

Contributor Author

I see. Unlike healthz/livez/readyz, flagz and statusz are defined API resources with their own GVK (config.k8s.io/v1alpha1/Flagz and config.k8s.io/v1alpha1/Statusz), so it makes more sense to use resource="flagz" with an empty subresource rather than putting the path in subresource. Updated

@richabanker
Contributor

A few nits, but LGTM otherwise. Thanks for the help on adding metrics coverage for these endpoints!

@yongruilin yongruilin force-pushed the master_zpages-metrics branch from 435226c to 83959a9 Compare February 26, 2026 21:55
…tusz metrics

Set component label to componentName (e.g., "kube-proxy", "kubelet")
instead of empty string. Move endpoint identifier from subresource to
resource ("flagz"/"statusz") since these are defined API resources with
their own GVK (config.k8s.io/v1alpha1).
@yongruilin yongruilin force-pushed the master_zpages-metrics branch from 83959a9 to c246753 Compare February 26, 2026 22:22
@richabanker
Contributor

/retest

@richabanker
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: bdf3f8ff3832572cf54b8dd3d981bebf940c4fca

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: richabanker, yongruilin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 28, 2026
@k8s-ci-robot k8s-ci-robot merged commit 7ae9c0d into kubernetes:master Feb 28, 2026
13 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.36 milestone Feb 28, 2026
@github-project-automation github-project-automation Bot moved this from Needs Triage to Done in SIG Instrumentation Feb 28, 2026

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Projects

Archived in project

5 participants