ztunnel: add ZDS server #42364

Merged
ldelossa merged 3 commits into main from pr/rgo3/ztunnel-zds-server
Dec 3, 2025

rgo3 (Contributor) commented Oct 23, 2025

This PR introduces the Ztunnel Discovery Service (ZDS) server implementation in Cilium, enabling integration with ztunnel. The ZDS server provides a protocol-based interface for communicating Cilium-managed endpoints to ztunnel, allowing ztunnel to establish inpod proxies for mTLS workload traffic.
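
As a rough illustration of what "communicating endpoints to ztunnel" can look like, the sketch below builds the ordered list of requests a ZDS server might send when ztunnel connects: one add request per known workload, followed by a marker that the snapshot is complete. The type and function names (`MsgKind`, `Workload`, `InitialSnapshot`) are hypothetical stand-ins loosely modeled on the request variants in zds.proto, not the generated protobuf types or Cilium's actual implementation. The connection handshake, acknowledgements, and transport are omitted.

```go
package main

import "fmt"

// MsgKind is a hypothetical, simplified stand-in for the ZDS request variants.
type MsgKind string

const (
	AddWorkload  MsgKind = "add"
	DelWorkload  MsgKind = "del"
	SnapshotSent MsgKind = "snapshot_sent"
)

// Workload carries the data this sketch assumes an add request needs.
// The field set is illustrative only.
type Workload struct {
	UID       string
	Namespace string
	Name      string
	NetnsPath string // pinned network namespace path of the pod
}

// Message is one request from the ZDS server to ztunnel.
type Message struct {
	Kind     MsgKind
	Workload *Workload // nil for SnapshotSent
}

// InitialSnapshot returns one add request per workload, in input order,
// followed by a single SnapshotSent marker.
func InitialSnapshot(workloads []Workload) []Message {
	msgs := make([]Message, 0, len(workloads)+1)
	for i := range workloads {
		msgs = append(msgs, Message{Kind: AddWorkload, Workload: &workloads[i]})
	}
	return append(msgs, Message{Kind: SnapshotSent})
}

func main() {
	msgs := InitialSnapshot([]Workload{
		{UID: "pod-a", Namespace: "default", Name: "a", NetnsPath: "/run/netns/a"},
		{UID: "pod-b", Namespace: "default", Name: "b", NetnsPath: "/run/netns/b"},
	})
	for _, m := range msgs {
		if m.Workload != nil {
			fmt.Println(m.Kind, m.Workload.UID)
		} else {
			fmt.Println(m.Kind)
		}
	}
}
```

Sending an explicit end-of-snapshot marker lets the receiver distinguish "no more workloads yet" from "this is the complete current set", which matters when state must be rebuilt after a restart.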

The implementation adds infrastructure for managing iptables rules within pod network namespaces when ztunnel operates in inpod mode, redirecting traffic through ztunnel for processing. To support this, endpoints now track their pinned network namespace paths, which are captured during CNI plugin operations.
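
To give a feel for the kind of rules involved, here is a deliberately simplified sketch that builds two NAT redirect rules as plain argument lists. It is not the rule set this PR installs: the `Rule` type, the `RedirectRules` helper, and the port numbers passed in `main` are assumptions for illustration, and the sketch leaves out the packet marks and exclusions that a real rule set needs to avoid redirecting the proxy's own traffic. In real code, each rule's table, chain, and spec could be handed to a library such as coreos/go-iptables from inside the pod's network namespace.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Rule is a hypothetical container for one iptables rule.
type Rule struct {
	Table string
	Chain string
	Spec  []string
}

// String renders the rule roughly as it would appear on an iptables command line.
func (r Rule) String() string {
	return "-t " + r.Table + " -A " + r.Chain + " " + strings.Join(r.Spec, " ")
}

// RedirectRules returns illustrative rules that send inbound TCP traffic to
// inboundPort and locally generated TCP traffic to outboundPort.
func RedirectRules(inboundPort, outboundPort int) []Rule {
	redirect := func(port int) []string {
		return []string{"-p", "tcp", "-j", "REDIRECT", "--to-ports", strconv.Itoa(port)}
	}
	return []Rule{
		{Table: "nat", Chain: "PREROUTING", Spec: redirect(inboundPort)},
		{Table: "nat", Chain: "OUTPUT", Spec: redirect(outboundPort)},
	}
}

func main() {
	// The port values are assumptions chosen for the example.
	for _, r := range RedirectRules(15006, 15001) {
		fmt.Println(r)
	}
}
```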

A reconciler-based enrollment system manages which endpoints participate in ztunnel's mTLS capabilities based on namespace membership. This declarative approach uses StateDB to maintain enrollment state and automatically handles endpoint lifecycle events, ensuring consistency across agent restarts through initial snapshot reconciliation. The reconciler filters out ineligible endpoints and prevents self-enrollment of ztunnel pods, while supporting bulk enrollment and disenrollment operations when namespace membership changes.
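
The filtering described above can be sketched as a pure function from the current endpoints and the set of enrolled namespaces to the endpoints that should be enrolled. Everything here is hypothetical: the `Endpoint` struct, its `IsZtunnel` flag, and `DesiredEnrollment` are stand-ins for illustration and do not reflect Cilium's endpoint type, its StateDB tables, or how the reconciler detects ztunnel pods.

```go
package main

import "fmt"

// Endpoint is a hypothetical, trimmed-down view of an endpoint.
type Endpoint struct {
	ID        uint16
	Namespace string
	NetnsPath string // empty if no pinned netns path is known
	IsZtunnel bool   // true for ztunnel's own pods
}

// eligible reports whether an endpoint may be enrolled at all.
func eligible(ep Endpoint) bool {
	return ep.NetnsPath != "" && !ep.IsZtunnel
}

// DesiredEnrollment returns the IDs of eligible endpoints whose namespace is
// in the enrolled set, preserving input order.
func DesiredEnrollment(eps []Endpoint, enrolled map[string]bool) []uint16 {
	var ids []uint16
	for _, ep := range eps {
		if enrolled[ep.Namespace] && eligible(ep) {
			ids = append(ids, ep.ID)
		}
	}
	return ids
}

func main() {
	eps := []Endpoint{
		{ID: 1, Namespace: "prod", NetnsPath: "/run/netns/a"},
		{ID: 2, Namespace: "prod"},                                             // no netns path
		{ID: 3, Namespace: "prod", NetnsPath: "/run/netns/z", IsZtunnel: true}, // ztunnel itself
		{ID: 4, Namespace: "dev", NetnsPath: "/run/netns/d"},                   // namespace not enrolled
	}
	fmt.Println(DesiredEnrollment(eps, map[string]bool{"prod": true}))
	// Removing the namespace from the enrolled set yields no enrolled endpoints.
	fmt.Println(DesiredEnrollment(eps, map[string]bool{}))
}
```

Computing the desired set from scratch on each pass, rather than applying incremental deltas, is what makes this style of reconciliation tolerant of missed events and restarts.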

maintainer-s-little-helper bot added the dont-merge/needs-release-note-label label ("The author needs to describe the release impact of these changes.") Oct 23, 2025
rgo3 self-assigned this Oct 28, 2025
rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch 3 times, most recently from 7be867e to a3e7330 October 31, 2025 08:53
rgo3 changed the base branch from main to pr/rgo3/ztunnel-xds-controlplane October 31, 2025 12:27
rgo3 force-pushed the pr/rgo3/ztunnel-xds-controlplane branch 4 times, most recently from 1436aa1 to 8e0ce81 November 6, 2025 17:35
Base automatically changed from pr/rgo3/ztunnel-xds-controlplane to main November 7, 2025 06:48
rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch 4 times, most recently from 0c94658 to be2f6cb November 14, 2025 07:56
rgo3 added the release-note/major ("This PR introduces major new functionality to Cilium.") and feature/ztunnel labels Nov 14, 2025
maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label label ("The author needs to describe the release impact of these changes.") Nov 14, 2025

rgo3 commented Nov 14, 2025

/test

rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from be2f6cb to 1106cd7 November 14, 2025 10:59
rgo3 marked this pull request as ready for review November 14, 2025 11:01
rgo3 requested review from a team as code owners November 14, 2025 11:01
ldelossa requested a review from squeed November 21, 2025 16:50
jrajahalme (Member) left a comment

LGTM, a few questions though.

rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from 21a570e to 48b1506 November 28, 2025 13:08

rgo3 commented Nov 28, 2025

/test

rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from 48b1506 to 8073fe6 December 1, 2025 23:55

rgo3 commented Dec 1, 2025

/test

rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from 8073fe6 to f01a167 December 2, 2025 13:45

rgo3 commented Dec 2, 2025

/test

rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from f01a167 to 2fe5012 December 2, 2025 15:23

rgo3 commented Dec 2, 2025

/ci-integration

rgo3 commented Dec 2, 2025

/test

rgo3 and others added 3 commits December 2, 2025 17:47
This introduces a new API for managing iptables rules when ztunnel
runs in inpod mode. The CreateInPodRules() function configures the
necessary network plumbing within a pod's network namespace to redirect
traffic to ztunnel for processing.

This also vendors the coreos/go-iptables library as a dependency for
managing iptables rules programmatically.

Signed-off-by: Robin Gögge <[email protected]>
Co-authored-by: Quang Nguyen <[email protected]>

This change introduces a Ztunnel Discovery Service (ZDS) server in
Cilium. The server communicates Cilium-managed endpoints to ztunnel
so that ztunnel can set up its inpod proxies.

The ZDS server implementation added here provides an API that
third-party code can consume to enroll endpoints with ztunnel or
disenroll them.

For this to be possible, the endpoint.Endpoint object now has a field
that stores the pinned netns path of a pod. The CNI plugin sets this
field on CNI ADD events.

For reference, the ZDS protocol can be found here:
https://github.com/istio/ztunnel/blob/master/proto/zds.proto

Signed-off-by: Robin Gögge <[email protected]>
Co-authored-by: Quang Nguyen <[email protected]>

Introduce a reconciler-based system for managing endpoint enrollment with ztunnel
based on namespace membership. The implementation uses StateDB to maintain an
EnrolledNamespace table that tracks which namespaces should have their endpoints
participating in ztunnel's mTLS capabilities.

The reconciler subscribes to the endpoint manager and reacts to endpoint lifecycle
events, enrolling endpoints when they are created in enrolled namespaces and
disenrolling them upon deletion. On startup, it waits for endpoint restoration to
complete before sending an initial snapshot of all eligible endpoints in enrolled
namespaces to ztunnel, ensuring consistency after agent restarts.

Endpoints without a network namespace path are filtered out, as are ztunnel pods
themselves, which prevents self-enrollment. When a namespace is added to the enrolled
set, all existing endpoints in that namespace are enrolled in bulk. Conversely, when
a namespace is removed, all its endpoints are disenrolled. This table-driven approach
provides declarative enrollment management and simplifies recovery from transient
failures through automatic reconciliation.

Signed-off-by: Robin Gögge <[email protected]>
Co-authored-by: Quang Nguyen <[email protected]>
rgo3 force-pushed the pr/rgo3/ztunnel-zds-server branch from 2fe5012 to 4feb4d2 December 2, 2025 16:47

rgo3 commented Dec 2, 2025

/test

maintainer-s-little-helper bot added the ready-to-merge label ("This PR has passed all tests and received consensus from code owners to merge.") Dec 2, 2025
ldelossa added this pull request to the merge queue Dec 3, 2025
Merged via the queue into main with commit 8c95135 Dec 3, 2025
394 of 397 checks passed
ldelossa deleted the pr/rgo3/ztunnel-zds-server branch December 3, 2025 16:51
aanm added the release-note/misc label ("This PR makes changes that have no direct user impact.") and removed the release-note/major label ("This PR introduces major new functionality to Cilium.") Jan 23, 2026
cilium-release-bot bot moved this to Released in cilium v1.19.0 Feb 3, 2026

Labels

feature/ztunnel
ready-to-merge: This PR has passed all tests and received consensus from code owners to merge.
release-note/misc: This PR makes changes that have no direct user impact.

Projects

No open projects
Status: Released

9 participants