KEP for flagz page for Kubernetes Components#4831
KEP for flagz page for Kubernetes Components#4831k8s-ci-robot merged 1 commit intokubernetes:masterfrom richabanker:component-flagz
Conversation
|
/assign |
|
|
||
| ### Data Format and versioning | ||
|
|
||
| Initially, the flagz page will exclusively support a plain text format for responses. We will implement these endpoints using versioned URLs, like /v1/flagz, to ensure future compatibility. This versioning strategy allows us to seamlessly introduce structured data formats (e.g., JSON) in the future, through a distinct endpoint such as /v2/flagz, without disrupting existing implementations. |
There was a problem hiding this comment.
Wouldn't it be better to have the endpoints in the form of /flagz/v1? Flagz is the subdomain that we want to version, if we were to put the version first, that would be confusing with the component version as it would then be at the top-level.
There was a problem hiding this comment.
Yes you're right, I got that the other way round. Fixed it now, thanks!
There was a problem hiding this comment.
One thing I was wondering since then is whether going with a query parameter might not be better. Something like:
/flagz?version=1
/flags?version=2
wdyt?
There was a problem hiding this comment.
This was also raised in the statusz KEP PR. I am ok with either, whichever seems the best to indicate the version for the endpoint.
There was a problem hiding this comment.
Updated the KEP to use query params for specifying the version.
| that might indicate a serious problem? | ||
| --> | ||
|
|
||
| - apiserver_request_duration_seconds metric that will tell us if there's a spike in request latency of kube-apiserver that might indicate that the flagz endpoint is interfering with the component's core functionality or causing delays in processing other requests |
There was a problem hiding this comment.
IIRC you won't get this metric for free, you'll have to make sure that the handler for the flagz endpoint is instrumented.
There was a problem hiding this comment.
Umm unsure about how flagz handler relates with this metric which I thought is enabled by default? The idea was to use this metric to monitor general load of requests that apiserver handles and use that as a signal to check if flagz is consuming all system resources causing delays in apiserver's capability to serve other resource requests..
though yeah I guess that wouldnt be a clean signal to detect issues with the flagz endpoint in a deterministic way.. Do you suggest introducing a new metric that keeps track of request latency for just /flagz requests ?
There was a problem hiding this comment.
I forgot to submit my comment on your PR, but you already dealt with that concern of mine in your POC: kubernetes/kubernetes#127581 (comment)
|
cc @johnbelamaric for PRR. Hi John, could you please review this KEP for prod-readiness whenever you get a chance? Thanks a lot! |
|
/assign @johnbelamaric whoops meant to add as assignee |
|
/assign |
|
|
||
| 1. **No sensitive data exposed** | ||
|
|
||
| We will ensure that no sensitive data is exposed through flagz and that access to this endpoint is gated by using system-monitoring group. |
There was a problem hiding this comment.
how? Do flag definitions allow us to express the idea that the flag is "sensitive" ?
There was a problem hiding this comment.
Do we need to push that up to pflag? One could argue that falgs should NEVER have sensitive information, since they are visible via ps.
There was a problem hiding this comment.
Do we need to push that up to pflag?
Or.. maybe we could create a separate set of "sensitive flags" and while printing out the values in the flagz handler, redact the values for those flags?
One could argue that falgs should NEVER have sensitive information, since they are visible via ps
That's indeed true. But if we still feel that exposing this info in an endpoint will be easier for attackers to get hold of, we can impose some restrictions on what data to expose in the endpoint.
There was a problem hiding this comment.
Do we have any examples of these sorts of flags?
There was a problem hiding this comment.
Perhaps the TLS related flags.. (also waiting for someone from sig-auth to help answer that..)
There was a problem hiding this comment.
I'm not aware any flags in long-lived components (kube-apiserver, kubelet, scheduler, controller-manager, etc) that contain sensitive information directly. For things like TLS or credentials, they always point at files which contain the data.
There are shorter-lived invocations that take flags containing sensitive information (like kubeadm init with --certificate-key or --token), and components built that bind client-go user flags (like kubectl) can specify a token credential on the command-line using --token. Both of those have the option to consume those values from a file as well.
pflag does have the ability to annotate flags, which we apparently use like this to mark some flags as "classified":
// AddSecretAnnotation add secret flag to Annotation.
func (f FlagInfo) AddSecretAnnotation(flags *pflag.FlagSet) FlagInfo {
flags.SetAnnotation(f.LongName, "classified", []string{"true"})
return f
}and then filter out here when building an annotation to include in API objects when the client chooses to record the change cause in an annotation (--record)
parseFunc := func(flag *pflag.Flag, value string) error {
flags = flags + " --" + flag.Name
if set, ok := flag.Annotations["classified"]; !ok || len(set) == 0 {
flags = flags + "=" + value
} else {
flags = flags + "=CLASSIFIED"
}
return nil
}There was a problem hiding this comment.
Great! Thanks for pointing out the pflag annotation feature. That's perfect for displaying the non-CLASSIFIED flags.
|
|
||
| #### Request | ||
| * Method: **GET** | ||
| * Endpoint: **v1/flagz** |
There was a problem hiding this comment.
this should be updated once we've agreed on the versioning format we want to follow
mattbailey
left a comment
There was a problem hiding this comment.
shadow prod readiness review
Just some nits/clerical (checkboxes), otherwise PRR looks good.
johnbelamaric
left a comment
There was a problem hiding this comment.
Some minor points on PRR, but looks OK to me.
|
PRR looks good, I will wait for sig approval to use the magic word |
|
/lgtm |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dgrisonnet, johnbelamaric, richabanker The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Uh oh!
There was an error while loading. Please reload this page.