feat/configurable ports #5540
Conversation
```yaml
value: "{{ .Values.webhooks.port }}"
{{- end}}
{{- if and .Values.webhooks.enabled .Values.webhooks.port }}
- name: "WEBHOOK_PORT"
```
This looks like a repeat of the block above?
Thank you, removed the duplicate
```yaml
value: ":{{ .Values.metrics.port }}"
{{- end}}
{{- if .Values.readiness.port }}
- name: "HealthProbeBindAddress"
```
Should this be HEALTH_PROBE_BIND_ADDRESS, to match the env defined in the command struct below?
Changed it, thanks
```yaml
- protocol: TCP
port: 9443
targetPort: 9443
port: {{ .Values.webhooks.port | default "9443" }}
```
The webhook Service port here itself can remain the same 9443, without the need for it to be configurable, right? Only the crossplane pod's webhook port, that is, the Service's targetPort below, needs configurability.
Changing the Service port (which is IMO unnecessary) would require changes in other places, like:
```yaml
- name: "WEBHOOK_SERVICE_PORT"
```
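For illustration, the shape being suggested might look like this (a sketch; the Service name and the 9610 targetPort are made-up values):

```yaml
# Sketch: the Service port stays fixed at 9443; only targetPort follows
# the pod's configurable webhook port.
apiVersion: v1
kind: Service
metadata:
  name: crossplane-webhooks
spec:
  ports:
    - protocol: TCP
      port: 9443        # fixed; consumers keep addressing 9443
      targetPort: 9610  # the only value that needs to be configurable
```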
You're right @ravilr, I didn't think about this before, but the service port can remain the same. I will refactor this again, but I need your input on this, as the override makes it a bit awkward:

- In the runtime config the user can provide a service with a port diverging from this
- The runtime config just embeds the upstream corev1.ServiceSpec, so we can't really prevent this from happening

Would you prefer:

- the `runtime_defaults.go#serviceFromRuntimeConfig` method to override it if present, or
- a new default override `ServiceWithPort(serviceName string, port int)` to be added to the `allOverrides` method?

I'm unsure which one is actually less surprising when reading the code. I think I'm leaning towards the second one.
- In the case of the crossplane core CLI, the newly added `--webhook-port` arg appears to be unambiguous already, capturing the webhook server listener port in the crossplane pod's core container. So I think the crossplane chart can continue hard-coding the webhook `Service` spec port to 9443 as before.
- In the case of the pkg DeploymentRuntimeConfig, the 2nd option from your proposal sounds good to me, since we are already expecting the DRC's serviceTemplate.spec override to match on the port name `webhook`.
The chart now hard-codes 9443 as the service port, and an override was added to ensure the port is always 9443. The test case was adjusted accordingly.
Thank you for the review, I'm currently on sick leave and will get back to you as soon as I am able to.
@ravilr can you advise me on the failed checks please?
Changes LGTM. Thanks. Still requires review from the maintainers. @Argannor do you want to remove the draft status? Regarding the CI job failures, maybe the maintainers can help re-kick them to run again.
Force-pushed from 39b3b18 to 869d767
Rebased the branch on top of master to resolve merge conflicts. However, I can't get the E2E tests to run on my local machine, and these tests seem to fail on other pull requests as well, so I suspect the failures are not related to my changes. Kindly requesting a review from one of the maintainers.
While deploying this change in our test cluster, I had to adjust the […]. With these changes and my PR to provider-kubernetes (crossplane-contrib/provider-kubernetes#301), I was able to […] while running EKS using Calico as the CNI. Looking forward to your feedback.
```yaml
ports:
- name: readyz
containerPort: 8081
containerPort: {{ .Values.readiness.port | default "8081" }}
```
The K8s spec expects this containerPort to be of integer type: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#containerport-v1-core

Unquote the default as it was before; also, maybe test the helm template rendering to check whether `.Values.readiness.port | int` is required?
Unquoted the value and verified that `| int` is not needed, even if the user specifies the value as a quoted string.
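To illustrate why (a sketch, not the chart's exact template): Helm substitutes the raw value into the output text, so even a quoted values entry renders as a bare integer.

```yaml
# values.yaml input, quoted on purpose:
#   readiness:
#     port: "8082"
# Template fragment (illustrative):
ports:
  - name: readyz
    containerPort: {{ .Values.readiness.port | default 8081 }}
# Renders as `containerPort: 8082` -- a bare YAML integer, as the
# ContainerPort API field requires.
```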
@Argannor did you happen to have an example of what a DeploymentRuntimeConfig using these port customizations looks like?
@jbw976 sure thing, here you go:

```yaml
# K8s provider
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-kubernetes
spec:
  package: "index.docker.io/crossplanecontrib/provider-kubernetes:v0.15.0"
  runtimeConfigRef:
    name: provider-kubernetes
---
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: provider-kubernetes
spec:
  deploymentTemplate:
    spec:
      selector: { }
      template:
        spec:
          hostNetwork: true
          nodeSelector:
            kubernetes.io/os: linux
          imagePullSecrets:
            - name: crossplane-workaround
          containers:
            - name: package-runtime
              args:
                - --webhook-port=9610
  serviceTemplate:
    spec:
      ports:
        - name: webhook
          port: 9200 # doesn't matter
          targetPort: 9610
          protocol: TCP
```

And more or less the same for functions as well:

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: Function
metadata:
  name: function-patch-and-transform
spec:
  package: xpkg.upbound.io/crossplane-contrib/function-patch-and-transform:v0.7.0
  runtimeConfigRef:
    name: function-patch-and-transform
---
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: function-patch-and-transform
spec:
  deploymentTemplate:
    spec:
      selector: { }
      template:
        spec:
          hostNetwork: true
          containers:
            - name: package-runtime
              args:
                - --address=:9612
  serviceTemplate:
    spec:
      ports:
        - name: webhook
          port: 9200 # doesn't matter
          targetPort: 9612
          protocol: TCP
```

And the values for the crossplane helm chart:

```yaml
hostNetwork: true
dnsPolicy: "ClusterFirstWithHostNet"
webhooks:
  port: 9600
metrics:
  enabled: true
  port: 9601
readiness:
  port: 9602
```

Edit: removed irrelevant parameters and added values.yaml
jbw976 left a comment
Thank you @Argannor for your diligence on this feature and your patience waiting for it to be merged. And thank you @ravilr for your attention and reviews so far!
I took this for a test spin today and ran into some obstacles, so perhaps you can provide some insight on if these are real issues or are an artifact of the testing environment or manifests/config I'm using 😇
I checked out your feat/configurable-ports branch locally and I'm using earthly +hack to build/deploy Crossplane locally into a kind cluster with the following changes intended to capture the helm values you provided:
```diff
 .hack/helm install --create-namespace --namespace crossplane-system crossplane .hack/charts/crossplane-0.0.0-hack.tgz \
   --set "image.pullPolicy=Never,image.repository=crossplane-hack/crossplane,image.tag=hack" \
-  --set "args={--debug}"
+  --set "args={--debug}" \
+  --set "hostNetwork=true,dnsPolicy=ClusterFirstWithHostNet,webhooks.port=9600,metrics.enabled=true,metrics.port=9601,readiness.port=9602"
```
Crossplane seems to come up OK and is honoring those port configurations:
```yaml
ports:
- containerPort: 9602
  name: readyz
  protocol: TCP
- containerPort: 9601
  name: metrics
  protocol: TCP
- containerPort: 9600
  name: webhooks
  protocol: TCP
```
However, I'm having trouble with the provider and function.
provider-kubernetes
The provider-kubernetes pod is in a CrashLoopBackOff with this in its logs:

```
crossplane-kubernetes-provider: error: unknown long flag '--webhook-port', try --help
```
- Looks like it's missing your changes from crossplane-contrib/provider-kubernetes#301. Is a built provider with those changes published to a public repo somewhere that I could use?
- Out of curiosity, what is the `imagePullSecrets: - name: crossplane-workaround` I see in your manifest?
- The ports with a comment `# doesn't matter` - can those be dropped from the manifest completely then, if the value doesn't matter? 🤔
function-patch-and-transform
The function pod is in a Pending state, and I see this event for the pod:

```
Warning  FailedScheduling  4m54s (x5 over 34m)  default-scheduler  0/1 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.
```
Is it possible that the manifests are specifying the wrong port name here? Your manifest specifies `webhook`, but should it be `grpc` instead? Or is this an artifact of using a kind cluster for my testing? Not sure what the root cause of this port conflict is and what should be done to fix it 🤔
Here's a snippet showing ports and args from kubectl get pod:
```yaml
spec:
  containers:
  - args:
    - --address=:9612
    env:
    - name: TLS_SERVER_CERTS_DIR
      value: /tls/server
    image: xpkg.upbound.io/crossplane-contrib/function-patch-and-transform:v0.7.0
    imagePullPolicy: IfNotPresent
    name: package-runtime
    ports:
    - containerPort: 8080
      hostPort: 8080
      name: metrics
      protocol: TCP
    - containerPort: 9443
      hostPort: 9443
      name: grpc
      protocol: TCP
```
Let me know how I can improve my testing environment and config to more accurately evaluate this PR, thank you! 🙇♂️
Hi @jbw976, thank you for testing these changes! I only used a "real" cluster to test the changes, as I'm currently unable to run kind on my machine. Let me try to answer your questions:

provider-kubernetes

[…]

function-patch-and-transform
```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: function-patch-and-transform
spec:
  deploymentTemplate:
    spec:
      selector: { }
      template:
        spec:
          hostNetwork: true
          containers:
            - name: package-runtime
              args:
                - --address=:9612
              ports:
                - containerPort: 9612
                  hostPort: 9612
                  name: grpc
                  protocol: TCP
                - containerPort: 9613
                  hostPort: 9613
                  name: metrics
                  protocol: TCP
  serviceTemplate:
    spec:
      ports:
        - name: webhook
          port: 9200 # doesn't matter
          targetPort: 9612
          protocol: TCP
```
I was able to build your branch of provider-kubernetes and I can confirm that the […]. The […]

Is that an issue, given that the pod's ports for […]? In general, how were you testing to validate that the webhook port is accessible and functional? That could help my testing here too 😁
I also tested again with […]. Without specifying any ports in the deployment template (only specifying them in the service template), I expected the crossplane composition machinery to not be able to reach the function, because the […]. However, it still worked 😮 - this must be because we're passing […]
Service to Pod port mapping

For me this was not an issue, and we decided to do it that way to keep the complexity of this change lower (see this discussion above), as modifying the service port would demand additional changes. As to why this works, honestly I don't know. I verified it on our AWS EKS cluster with Calico, and the control plane (running outside of the overlay network provided by Calico) reached the container running in the host network on the pod's port. But I don't understand how that works. Probably some kind of kube-proxy/iptables magic going on there. And the ports are not in conflict with each other, as each service gets its own IP address.

Testing done

In our test cluster (again AWS EKS + Calico), I deployed:

[…]

All of them running in the hostNetwork with ports manually adjusted (as outlined in my previous examples). Then I wrote a Composition for the function and saw that crossplane and the function could talk to each other and worked as expected. For provider-kubernetes it was a little bit more complicated: if you deploy an old version (pre 0.11.0, so v0.10.0) and deploy a simple […]. I have no idea how to replicate this with kind, as you would somehow need to create this split network, where workloads and control plane are not in the same network.

Regarding function-patch-and-transform and the port mystery

Kubernetes doesn't really care too much about the ports declared in a container, just like docker doesn't really care about the ports exposed in the Dockerfile (see the Kubernetes API Reference, Container spec). So it's a nice-to-have to actually declare them, and I would even say it depends on the concrete runtime whether a port gets opened or not, but I'm not sure about it. So now we know that kind creates a port forwarding for all declared ports. I'm not sure if the "normal" kubelet does the same.
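To make that last point concrete, here's a minimal sketch (names, image, and ports are hypothetical) of why a numeric targetPort reaches the pod even without declared container ports:

```yaml
# kube-proxy forwards Service traffic straight to podIP:targetPort when
# targetPort is numeric, so the pod's `ports:` declarations are purely
# informational here and may even be omitted.
apiVersion: v1
kind: Service
metadata:
  name: function-demo
spec:
  selector:
    app: function-demo
  ports:
    - port: 9443
      targetPort: 9612 # forwarded to podIP:9612 as-is
      protocol: TCP
---
apiVersion: v1
kind: Pod
metadata:
  name: function-demo
  labels:
    app: function-demo
spec:
  containers:
    - name: package-runtime
      image: example.com/function:latest # placeholder image
      args:
        - --address=:9612 # the process listens on 9612
      # no `ports:` declared -- traffic still arrives
```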
jbw976 left a comment
This seems to be working with my testing in a local kind instance for scenarios where I set specific ports and where I don't specify anything at all - so it seems the plain vanilla scenarios haven't regressed as far as I can tell 💪
Documentation and examples will be important for this feature when it's merged - that will help folks adopt and use this very easily, e.g. the examples you provided me for my testing streamlined my usage.
I'm still a bit uncomfortable not knowing how the network mapping works, both in your AWS environment and my kind environment, even when the service target port doesn't match the pod/container port. Your explanation of "Kubernetes doesn't really care too much about the ports declared in a container" could be sufficient for our usage though 😇
I have a few comments including in this review, mostly around the API we're exposing and the usage of webhook vs grpc for Functions. Let me know what you think. Thanks for your continued efforts on this! 🙇♂️
```go
ServiceWithSelectors(b.podSelectors()),
ServiceWithAdditionalPorts([]corev1.ServicePort{
	{
		Name: webhookPortName,
```
This `Service()` function is used in the context of both providers and functions. For Functions, the port name is `grpc`, so always setting it to `webhook` could cause some confusion. Is there a better way to set this to the appropriate name the pod will be using?
I will move the service overrides to the calling side, so that we can have different port names for functions (grpc) and providers (webhook)
```go
endpointPort := servicePort
for _, port := range svc.Spec.Ports {
	if port.Name == webhookPortName {
```
This further adds to the confusion of function webhook vs. grpc - the intent here is to find the grpc port for the function, but we're calling it a webhook port (I think 😇)
As the port is now named `grpc`, we can use `grpcPortName` here instead too :)
```go
Ports: []corev1.ServicePort{
	{
		Name: webhookPortName,
		Port: int32(webhook.DefaultPort),
```
Is it odd that we are using `webhook.DefaultPort` for the comparison here, when our code is setting it explicitly using `servicePort` instead? Those two values happen to be the same, but maybe it's still odd to use this for the comparison?
whoops, yeah that's confusing and will hurt later. I'll change it.
cmd/crossplane/core/core.go (Outdated)
```go
MetricsBindAddress string `default:":8080" env:"METRICS_BIND_ADDRESS" help:"The address the metrics server listens on." placeholder:"host:port"`

HealthProbeBindAddress string `default:":8081" env:"HEALTH_PROBE_BIND_ADDRESS" help:"The address the health probe endpoint binds to." placeholder:"host:port"`
```
It's feeling a bit odd to expose these as addresses in the command line args, but as port numbers in the helm chart values. What do you think about consolidating them and only exposing port numbers everywhere?

Is there a use case where the user would need to set the host here, in addition to the port? I'm not sure there is, but if there is, we'd probably need to change the helm chart values later anyway to make that easier to set, since we aren't exposing it there now.

My instinct is to just expose port numbers everywhere unless there's a clear use case for setting the host too 🤔
There is no specific use case, I just used them as it was the more flexible approach and those were the arguments that had to be passed on. Changed it to port numbers only.
Both the metrics and health bind address flags are passed into / consumed by k8s-sigs/controller-runtime's Manager Options.

The argument towards keeping them as address strings is: some security-focused setups would not want to expose the metrics port on all interfaces, but only on 127.0.0.1:8080, and then front it with an rbac-proxy sidecar that requires auth for the Prometheus scrapers to be able to reach it.
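A rough sketch of such a setup, assuming kube-rbac-proxy as the fronting sidecar (images, tags, and names here are placeholders, not part of this PR):

```yaml
# The manager binds metrics to loopback only; the authenticating proxy
# is the only listener exposed on the pod's interfaces.
apiVersion: v1
kind: Pod
metadata:
  name: crossplane-with-metrics-proxy
spec:
  containers:
    - name: crossplane
      image: crossplane/crossplane:latest # placeholder tag
      args:
        - --metrics-bind-address=127.0.0.1:8080 # not reachable from outside the pod
    - name: kube-rbac-proxy
      image: quay.io/brancz/kube-rbac-proxy:v0.18.0 # example proxy image
      args:
        - --secure-listen-address=0.0.0.0:8443 # exposed; requires auth
        - --upstream=http://127.0.0.1:8080/ # reaches the manager over loopback
      ports:
        - containerPort: 8443
          name: metrics
```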
Thank you for your thorough review! I agree that documentation is necessary for this scenario. Do you want me to add a PR for that as well? In that case I'd ask you to guide me on where (which repository) to do that, and where to put the added docs (as in where in the documentation). I addressed your comments above and adjusted the code accordingly, except for the comment about webhook service port vs webhook port, as I don't see the need for that (see my comment above).
Sorry, forgot to address the check-diff question. No, I was wondering about that one too. Git doesn't detect any changes; besides, I ran […]. Maybe a line-ending issue? I'm working on Windows + WSL, so it wouldn't be the first time line endings got messed up.
From https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.31/#serviceport-v1-core:

So, if the targetPort in the Service is specified as the correct listening port number of the target pod's container, Service -> pod routing/forwarding doesn't rely on the target pod's container ports spec. Only when specifying the targetPort in the Service as a named port do the names have to match the target pod's container ports spec for the lookup to be successful.

Considering the confusion this might cause for users wanting to customize ports for their setups, and to keep the configuration surface simple, I'm wondering whether exposing the serviceTemplate ports override is needed at all. If the pkg runtime Service builder methods are updated to leverage the named container ports, could you think of any other use case?

For example, to override a provider deployment's ports, one would just need to specify:

```yaml
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: provider-upjet-aws-rds
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          hostNetwork: true
          containers:
            - name: package-runtime
              ports:
                - containerPort: 18001
                  name: webhook
                  protocol: TCP
                - containerPort: 19001
                  name: metrics
                  protocol: TCP
              env:
                - name: WEBHOOK_PORT
                  value: "18001"
                - name: METRICS_BIND_ADDRESS
                  value: ":19001"
```

and the pkg runtime Service builder would just hard-code the targetPort to the named port webhook.
Also @Argannor, I don't think one has to run Function runtime pods in hostNetwork (hence no port customization required there). Are you running on EKS and finding the Crossplane-pod-in-hostNetwork not able to talk to a function grpc pod running in the pod network? In setups where the pod network is configured to use a custom CNI in overlay mode (could be Calico overlay, Cilium overlay, etc.), Pod-to-Pod communications (and Pod-in-HostNetwork to Pod-in-Overlay-network) themselves work over the overlay network enabled by the respective CNI, so Crossplane pod -> Function pod grpc endpoints work seamlessly (with an appropriate NetworkPolicy[1], if one is being enforced). But the EKS APIServer doesn't have direct connectivity (a route) to talk to the pods on the overlay network. For kube APIServer (or EKS managed control plane) -> provider conversion webhook (or crossplane core webhook) communications, running the provider (or crossplane core) in hostNetwork, and hence port customization to avoid port conflicts with other provider deployments, is required. For Crossplane pod -> function grpc communication, there is no kube APIServer/kube control plane involved, right?

[1] NetworkPolicy allowing all ingress and egress on grpc port 9443 in the crossplane-system namespace, where the crossplane-core pod and the function pods are deployed:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-functions-egress-ingress
  namespace: crossplane-system
spec:
  policyTypes:
    - Ingress
    - Egress
  podSelector: {}
  ingress:
    - ports:
        - port: 9443
          protocol: TCP
  egress:
    - ports:
        - port: 9443
          protocol: TCP
```
You're perfectly right, I think I got confused by the fact that we needed to adjust the port in the […]. To verify that, I created a small gist using […], where the client container was able to first perform an nslookup to get the overlay network IP and then reach the nginx.

That's a good question... The API server gets involved as soon as the function tries to manage multiple versions of its own CRDs and needs to provide conversion between versions. But that wouldn't be happening via gRPC, so we would actually need to change the code again to accommodate an optional webhook port as well. Or am I mistaken here?

I don't see any other use case, so using the port names directly could make for a less confusing solution and reduce the verbosity of the necessary config. I like that suggestion.
Update on using port names: in order for gRPC to know which port the function is listening on, we need to set the port as part of the […]. This makes things more difficult, as the service needs to be set up in the […].

Do you have any better ideas? I don't particularly like the prospects of either of those options, but the first could be the lesser evil.
I don't think Function runtime pods would ever have a webhook listener, only a grpc port.

Since there is no need to run function runtime pods on hostNetwork, no port customization is required there. So servicePort == target pod containerPort == 9443 should just work, as it does right now before this PR change, right?
Who manages the CRDs of functions? Could you please elaborate your reasoning for me?

Sorry, I was too focused on making it work. You're right: if we don't allow for customization, and it's never run on the host network (which shouldn't be required), we can hard-code it again. But then why is the port configurable in the first place (via the SDK), and should we in this case also prevent customization of the container ports, since that would lead to an invalid configuration?
As of today, Function input CRDs exist and get deployed through function packages to the crossplane cluster, just for capturing the input schema of the function. They are never used for storing CRs of that CRD kind in the kube-api storage. As such, a conversion webhook for function input CRDs doesn't make sense? cc @negz @jbw976 to clarify whether Function runtime pods would ever have a kube-apiserver -> webhook-listener-on-function-pod communication pattern.

Are you referring to […]?
Regardless of the outcome of this discussion, I […] and verified in our test cluster again that the communication works between:

[…]

Let me know what you think about this.
Excuse my persistence, but what do you think of the latest changes? Do these address your concerns?
@Argannor @ravilr great work continuing to push on this effort since the last time I reviewed. I had to focus on a few other things with the v1.18 release and KubeCon, but after KubeCon settles down I'm planning on taking another pass through this. I would love to get this merged early in the v1.19 milestone so it can bake a bit and definitely be included as a popular feature of 1.19! 🙇♂️
Thanks @Argannor @ravilr for the work on this PR since the last time I reviewed, as well as your patience. Let's see if we can make a push to get this in for v1.19! Code freeze is currently Jan 28, so let's try to get this finished and have something to ship!

A few notes to share:

- Overall, I'm liking the simplicity that you've found in the moving parts and the surface area that is exposed. Thanks for the ideas @ravilr on what didn't actually need implementing (e.g. configurable function gRPC ports).
- Looks like @ravilr's idea to use named ports and completely avoiding exposing a service template spec worked OK? I'm concluding that because I don't see the `DeploymentRuntimeConfig` API being touched at all anymore. I love that simplification, so we don't have to expose an entire config object with lots of knobs.
- Do you have some updated manifests that are representative of the testing you've been doing @Argannor? e.g. what you provided me in #5540 (comment) was really helpful for my testing, and now that the surface area / UX has been simplified a lot, it would be helpful if you can once again share what you're using to test, so I can use that as a starting point for my testing 😉
- Your testing results look very encouraging so far for your setup! crossplane (host network) -> functions (cluster network) and API server (outside cluster network) -> providers (host network)
- I'm personally only aware of the function address being configurable in the SDK and function template for local dev/test scenarios.

> cc @negz @jbw976 to clarify whether Function runtime pods would ever have a kube-apiserver -> webhook-listener-on-function-pod communication pattern.

As you all surmised, it's not expected for instances/resources to ever be created of the input CRDs that functions expose. Those CRDs were never intended to be installed into the control plane actually; the package manager unexpectedly found them and installs them. This is tracked in #5294. Therefore, I don't see any need to support webhook port configuration for functions, at least right now.

I'll take this for a test pass once you share some manifests @Argannor, then give the code another deeper look over. Just wanted to share these thoughts now as we're ramping up to get this included in v1.19! 🙏 💪
@jbw976 if you're looking to take the latest changes (without the […]) for a test run, here is an example DeploymentRuntimeConfig:

```yaml
---
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: provider-kubernetes
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          hostNetwork: true
          nodeSelector:
            kubernetes.io/os: linux
          imagePullSecrets:
            - name: crossplane-workaround
          containers:
            - name: package-runtime
              args:
                - --webhook-port=9610 # customize webhook port to hostNetwork 9610; default is 9443, if unset
                - --metrics-bind-address=":9611" # customize metrics port to hostNetwork 9611; default is 8080, if unset
              ports:
                - containerPort: 9610 # should be set to the same thing set in the arg above
                  name: webhook
                  protocol: TCP
                - containerPort: 9611 # should be set to the same thing set in the arg above
                  name: metrics
                  protocol: TCP
```

As for the functions runtimeConfig, there isn't a need for running function pods in hostNetwork, as the crossplane pod -> function pod communications are within the cluster network (all CNIs enable direct connectivity from the cluster node's hostNetwork -> cluster pod network), hence no need for any port customization. As such, the recommendation for folks running custom CNI deployments would be to continue to use Crossplane's default rendered function runtime spec.
jbw976 left a comment
I was able to test this more thoroughly today and look through all the code as well. The behavior I'm expecting to see appears to all be working from my testing with a local kind cluster 🎉 You can see my testing notes in this gist:
https://gist.github.com/jbw976/0fb9c7070a541dffaa286883de13c26b
I've added some comments that are mostly around converging the code changes with the final design/solution we've landed on. Take a look at those and let me know what you think.
Code freeze for v1.19 is now Feb 4 on the community calendar and I'm definitely hoping we get this in! 🙇
@Argannor would you also be able to update the PR description to make sure it accurately captures the final code changes/design?
```go
if tmpl.Metadata.Name != nil {
	svc.Name = *tmpl.Metadata.Name
}
if meta := tmpl.Metadata; meta != nil {
```
I think this refactoring is still lingering from when we were allowing the entire ServiceTemplate spec also - now that we're back to just the metadata, we can probably put this function back to exactly how it was before. 0 changes 😇
```go
want want
}{
	"ProviderServiceNoRuntimeConfig": {
		reason: "No overrides should result in a deployment with default values",
```
This test case is slightly confusing for me because it says "no overrides", but still does pass overrides below, and it also seems to be mentioning provider things but then passes a service override for gRPC, which is a function thing.

Any clean-up you think you should do for this unit test?
@ravilr, thanks for the manifest provided in #5540 (comment)! To get it working, I only had to remove the quotes around the `--metrics-bind-address` value.
Thank you @jbw976 for the review. Unfortunately I'm currently out with a fever and not sure if I can incorporate your review until the code freeze. Feel free to take over.
* expose crossplane ports for metrics, webhooks, and health probes in helm chart and CLI args
* allow providers/functions to have their ports configured through a DeploymentRuntimeConfig

Signed-off-by: Argannor <[email protected]>
Signed-off-by: Jared Watts <[email protected]>
@Argannor I'm sorry to hear you are feeling ill! I hope you recover soon 🙏 I've taken all your changes and squashed/rebased them to the latest on master. I'm doing some more testing now and I'll look through the code once again, but if all that looks good, I'll force-push to your branch and then merge this PR!
Force-pushed from 96dab3d to a3978ae
jbw976 left a comment
I've finished my final testing, squash/rebased the changes, and I think this is good to go for v1.19! 🚀
Thanks again @Argannor for your diligence and patience on this PR which is a highly demanded feature, and thank you @ravilr for your thorough reviews and insights to simplify this implementation as well. 🙇♂️
Part of crossplane/crossplane#5520

As outlined in the above issue, when running Crossplane in certain environments (like EKS using Calico CNI/Cilium in overlay network), where the control plane resides outside the workload network, it is necessary to run webhooks in the host network. This calls for making the listening ports (webhook and metrics) of provider controllers configurable, to avoid port conflicts among the pods of different provider families.

How has this code been tested: Deployed these changes along with the changes from feat/configurable ports crossplane/crossplane#5540, which allows customizing ports in DeploymentRuntimeConfig, and verified that a provider launched with these new flags with custom ports and a corresponding DeploymentRuntimeConfig had conversion webhook communications working without errors.

Signed-off-by: ravilr <[email protected]>
Description of your changes
When running crossplane (and providers/functions) in the host network, the need to configure ports arises.
This MR has two parts:
- making crossplane's own webhook, metrics, and health probe ports configurable via CLI flags and helm values, and
- allowing the `DeploymentRuntimeConfig` to override the default ports (`metrics`, `webhook`, `grpc`) in both `Deployment` and `Service`.

Additional merge requests would be necessary to make providers and functions honor these arguments (see e.g. crossplane-contrib/provider-kubernetes#301).
Putting this on draft to gather your feedback. Please let me know what you think about it and how to proceed with the follow-ups.
Contributes to #5520
Crossplane Ports
The following configuration options (flags / environment variables) were added, with their defaults set to the current values:

- `--webhook-port=9443`
- `--metrics-bind-address=:8080`
- `--health-probe-bind-address=:8081`

Additionally, these options were added to the values of the helm chart, defaulting to unset:

- `webhooks.port`
- `metrics.port`
- `readiness.port`

Note that for the configuration options I chose to stick as close as possible to the underlying API for increased flexibility (thus the bind-address choice), while the helm chart is more oriented towards a consistent user experience (thus ports for all 3 options). Let me know what you think about it!
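For example, all three can be set together in a values file (port numbers are arbitrary, taken from the test setup earlier in this thread):

```yaml
webhooks:
  port: 9600
metrics:
  port: 9601
readiness:
  port: 9602
```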
Provider, Functions, Service Ports
The core idea is to allow the user to specify the ports in the DeploymentRuntimeConfig (DRC), by setting `spec.deploymentTemplate.spec.template.spec.containers[0].ports` and passing `args` to the container to tell the provider which ports to bind to. An example DRC can be found at https://github.com/jbw976/crossplane/blob/configurable-ports-review/provider-k8s.yaml.

Here's a brief summary of the changes:

- `DeploymentWithOptionalPodScrapeAnnotations` uses the port number of a given `metrics` port if set in the DRC deployment template.
- `DeploymentRuntimeWithAdditionalPorts` only adds ports not contained in the DRC deployment template (based on comparing their names).
- `ServiceWithAdditionalPorts` only adds ports not contained in the DRC service template (based on comparing their names).
- The `Name` attribute of webhook ports in services is set to `webhook` (to facilitate the name comparison).

Unit tests were added.
Housekeeping
I have:

- Run `make reviewable` to ensure this PR is ready for review.
- Added or updated e2e tests.
- Linked a PR or a docs tracking issue to document this change.
- Added `backport release-x.y` labels to auto-backport this PR.

Need help with this checklist? See the cheat sheet.