ztunnel: add xds control-plane #42169

Merged
julianwiedmann merged 4 commits into main from
pr/rgo3/ztunnel-xds-controlplane
Nov 7, 2025

@rgo3
Contributor

@rgo3 rgo3 commented Oct 14, 2025

This PR acts as a first part in a series of PRs to introduce native ztunnel integration into Cilium, enabling Cilium to act as a control plane for the standalone ztunnel proxy. This PR provides both a certificate authority (CA) server for mTLS certificate management and an xDS control plane for workload discovery, as well as some initial configuration options to enable this functionality.

It should be noted that the provided CA server is mainly suitable for testing and smaller deployments.

Please see the individual commits and their respective messages for more detailed descriptions of the changes.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 14, 2025
@rgo3 rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch 15 times, most recently from 373e3d5 to b6523df Compare October 20, 2025 15:49
@joestringer joestringer added the release-note/major This PR introduces major new functionality to Cilium. label Oct 20, 2025
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Oct 20, 2025
@rgo3 rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch from b6523df to f260816 Compare October 21, 2025 10:06
@ldelossa ldelossa force-pushed the pr/rgo3/ztunnel-xds-controlplane branch 3 times, most recently from 550bf77 to b2d7bc7 Compare October 28, 2025 00:59
@ldelossa
Contributor

/test

@ldelossa ldelossa force-pushed the pr/rgo3/ztunnel-xds-controlplane branch from b2d7bc7 to 0122dcc Compare October 28, 2025 13:47
@ldelossa
Contributor

/test

@ldelossa ldelossa marked this pull request as ready for review October 28, 2025 15:18
@ldelossa ldelossa requested review from a team as code owners October 28, 2025 15:18
@rgo3 rgo3 mentioned this pull request Oct 31, 2025
@rgo3 rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch 2 times, most recently from 861e0c4 to 2ffada3 Compare November 3, 2025 22:08
Member

@gandro gandro left a comment

LGTM for the code owned by my codeowners. I didn't re-review the ztunnel code

@rgo3
Contributor Author

rgo3 commented Nov 6, 2025

/test

rgo3 and others added 3 commits November 6, 2025 17:38
Introduce basic controlplane structure for ztunnel integration using
a new Cell in the Hive framework. This provides the foundation for
implementing ztunnel control logic. It also adds new codeowners
for the ztunnel package.

Signed-off-by: Robin Gögge <[email protected]>
Add the necessary scaffolding and xDS certificate authority server for
usage with ztunnel.

Following the pattern of IPSec's key injection, we update the Cilium
daemonset to wait on a secret-backed volume mount named
"cilium-ztunnel-secrets".

The secret includes 4 items, all of which are PEM encoded.
    1. Bootstrap Certificate (bootstrap-root.crt)
    2. Bootstrap Private Key (bootstrap-private.key)
    3. CA Certificate (ca-root.crt)
    4. CA Private Key (ca-private.key)

The bootstrap items are used to bootstrap a TLS connection between
ztunnel and the CA server introduced into Cilium.

The CA items are used to create and sign certificates given
a certificate signing request from ztunnel.

The CA server implements the necessary gRPC server and methods expected
by ztunnel. See the `github.com/cilium/cilium/ztunnel/pb` package for more
details.

Signed-off-by: Louis DeLosSantos <[email protected]>
Signed-off-by: Robin Gögge <[email protected]>
This commit introduces a minimal xDS (Extensible Discovery Service)
control plane implementation enabling Cilium to act as a control plane
for the standalone ztunnel proxy. This implementation bridges Cilium's
endpoint management with ztunnel's workload discovery requirements.

Background:

ztunnel is Istio's zero-trust tunnel proxy that handles L4 secure
communication between workloads using HBONE (HTTP-Based Overlay Network
Environment). To function, ztunnel requires a control plane that
implements the Istio Workload API to discover workloads and services
in the cluster. This commit enables Cilium to serve as that control
plane.

Implementation Details:

The xDS control-plane implements the Delta Aggregated Discovery Service
protocol, which is a bidirectional gRPC stream between Cilium and ztunnel.
It provides transformation logic between Cilium's endpoint model and
Istio's Workload API, and subscribes to Cilium's existing K8s watchers
(K8sCiliumEndpointsWatcher) to receive real-time updates about
clusterwide endpoint lifecycle events.

Protocol Flow:

  1. ztunnel connects and sends DeltaDiscoveryRequest for Address resources
  2. Cilium responds with initial seed of all workloads on the node
  3. StreamProcessor subscribes to endpoint events via resource.Store
  4. As endpoints change, updates are batched and streamed to ztunnel
  5. ztunnel ACKs/NACKs each response via nonce matching

Dependencies:

This commit explicitly vendors the Istio Workload API protobuf file:
  - istio.io/istio/pkg/workloadapi: Protobuf definitions for Workload,
    Service, and Address types

Co-authored-by: Hemanth Malla <[email protected]>
Signed-off-by: Robin Gögge <[email protected]>
Signed-off-by: Louis DeLosSantos <[email protected]>
@rgo3 rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch from 2ffada3 to 1436aa1 Compare November 6, 2025 16:38
Signed-off-by: Hemanth Malla <[email protected]>
Co-authored-by: Robin Gögge <[email protected]>
@rgo3 rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch from 1436aa1 to 8e0ce81 Compare November 6, 2025 17:35
@rgo3
Contributor Author

rgo3 commented Nov 6, 2025

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Nov 6, 2025
@julianwiedmann julianwiedmann added this pull request to the merge queue Nov 7, 2025
Merged via the queue into main with commit fe866ec Nov 7, 2025
360 of 362 checks passed
@julianwiedmann julianwiedmann deleted the pr/rgo3/ztunnel-xds-controlplane branch November 7, 2025 06:48
nezdolik pushed a commit to nezdolik/cilium that referenced this pull request Jan 14, 2026
- `go mod tidy && go mod vendor && go mod verify`
- `cd enterprise/hubble-timescape && go mod tidy && cd ../..`
- fixed minor conflicts in `bpf/bpf_lxc.c`, `bpf/bpf_overlay.c` and
  `bpf/lib/nodeport.h` so that both new OSS code and previous Enterprise
  includes are present
- fixed conflicts in `pkg/datapath/config/host_config.go`,
  `pkg/datapath/config/lxc_config.go` and `pkg/datapath/config/overlay_config.go`
- adapted `enterprise/pkg/maps/extepspolicy/table.go`,
  `enterprise/pkg/fqdnha/relay/namemanager.go` and
  `enterprise/pkg/maps/extepspolicy/writer_test.go` due to function
  signature changes in OSS
- `git cherry-pick -n 3d4abeb61b72d910c58ddb199b189c86c4eaf326
  71023768865b9085e6aa8991c553997e1cc6f9b8` to pick up patches from
  @rastislavs (+ manual added fix in
  `enterprise/pkg/bgpv1/manager/reconcilerv2/neighbor_test.go` based on
  patch changes)
- `make -C images update-builder-image update-runtime-image`
- `make -C Documentation update-cmdref`
- `./contrib/scripts/enterprise-testowners.sh`
- remove duplicate `Cleanup Disk space in runner` step in `.github/workflows/cilium-cli.yaml`
- fix mindfulness issues by manually fixing stuff coming from the
  following PRs:
  - cilium#42169
  - cilium#42011
  - cilium#42012
- `make generate-enterprise-apis`
- adjusted `enterprise/pkg/ingresspolicy` after commit 2faed3a
  ("policy: fix selector policy leak and detachment issues") removed the
  implicit addition of the identity on lookup. Now the identity needs to
  be added and removed in the identity manager.
- Set `clustermesh.config.enabled=true` in
  enterprise-clustermesh-overlapping-podcidr workflow following commit
  562ba2c ("clustermesh: set authMode to migration by default").

Signed-off-by: Nicolas Busseneau <[email protected]>
nezdolik pushed a commit to nezdolik/cilium that referenced this pull request Jan 14, 2026
- `go mod tidy && go mod vendor && go mod verify`
- `cd enterprise/hubble-timescape && go mod tidy && cd ../..`
- fixed minor conflicts in `bpf/bpf_lxc.c`, `bpf/bpf_overlay.c` and
  `bpf/lib/nodeport.h` so that both new OSS code and previous Enterprise
  includes are present
- fixed conflicts in `pkg/datapath/config/host_config.go`,
  `pkg/datapath/config/lxc_config.go` and
  `pkg/datapath/config/overlay_config.go`
- adapted `enterprise/pkg/maps/extepspolicy/table.go`,
  `enterprise/pkg/fqdnha/relay/namemanager.go` and
  `enterprise/pkg/maps/extepspolicy/writer_test.go` due to function
  signature changes in OSS
- `git cherry-pick -n 3d4abeb61b72d910c58ddb199b189c86c4eaf326
  71023768865b9085e6aa8991c553997e1cc6f9b8` to pick up patches from
  @rastislavs (+ manual added fix in
  `enterprise/pkg/bgpv1/manager/reconcilerv2/neighbor_test.go` based on
  patch changes)
- `make -C images update-builder-image update-runtime-image`
- `make -C Documentation update-cmdref`
- `./contrib/scripts/enterprise-testowners.sh`
- remove duplicate `Cleanup Disk space in runner` step in
  `.github/workflows/cilium-cli.yaml`
- fix mindfulness issues by manually fixing stuff coming from the
  following PRs:
  - [cilium#42169](cilium#42169)
  - [cilium#42011](cilium#42011)
  - [cilium#42012](cilium#42012)
- `make generate-enterprise-apis`
~- adjusted `enterprise/pkg/ingresspolicy` after commit 2faed3a
  ("policy: fix selector policy leak and detachment issues") removed the
  implicit addition of the identity on lookup. Now the identity needs to
be added and removed in the identity manager.~ Split into separate PR
isovalent/cilium#9506 to ease review and
backporting.
- Set `clustermesh.config.enabled=true` in
  enterprise-clustermesh-overlapping-podcidr workflow following commit
  562ba2c ("clustermesh: set authMode to migration by default").
- Had to revert the following commits because they break the ILB CI
  workflow. Thanks to @mhofstetter for bisecting! See discussion for more
  details. Upstream fix and re-applying the changes is tracked in
  isovalent/cilium#9511.
  - cilium#42986
    - 6781758
    - 3cfe7a1
    - a8fd4ed
    - 64e171e
  - cilium#42973
    - c171f22 (with minor conflict resolution)
    - 9530af5
    - not necessary to revert the last 2 commits of that PR
@aanm aanm added release-note/misc This PR makes changes that have no direct user impact. and removed release-note/major This PR introduces major new functionality to Cilium. labels Jan 23, 2026
@cilium-release-bot cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026

Labels

feature/ztunnel ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

Projects

No open projects
Status: Released

Development

Successfully merging this pull request may close these issues.

Add support for ztunnel [ZTunnel] xDS control-plane integration