Support tunnel routing in Multi-Pool IPAM mode#38015
👋 Thank you for pushing this forward! Just some (potentially obvious) questions from the peanut gallery:
Store the allocation CIDRs of each remote node in the ipcache, using the world identity. This lays the groundwork for using the ipcache for tunneling instead of relying on the additional tunnel map.
Co-authored-by: Sebastian Wicki <[email protected]>
Signed-off-by: Sebastian Wicki <[email protected]>
Signed-off-by: Fabio Falzoi <[email protected]>
This commit adds a unit test asserting the expected behavior for pod
CIDR entries upserted to IPCache by the node manager.
In particular, we also test two corner cases:
- The first case tests a scenario where the pod CIDR is also selected
by a second resource, ensuring that tunnel and encryption key are
preserved.
- The second case tests a scenario where the pod CIDR is a /32 and the
legacy API upserts the pod IP (e.g. based on CEP or kvstore),
thereby shadowing the existing pod CIDR entry.
Signed-off-by: Sebastian Wicki <[email protected]>
The ipcache map now stores entries describing remote node pod CIDRs, including the tunnel endpoint and encryption key. Therefore, when a packet must be encapsulated or encrypted we can leverage the information from the lookup into the ipcache, avoiding the additional one into the tunnel map. Signed-off-by: Fabio Falzoi <[email protected]>
When handling the deletion of a node, we should take into account all of the node's allocation CIDRs, not just the primary one. Signed-off-by: Fabio Falzoi <[email protected]>
Now that the ipcache map is used to support tunneling, multi-pool IPAM is compatible with tunnel routing. Therefore, lift the startup check that stops the agent when both features are enabled. Signed-off-by: Fabio Falzoi <[email protected]>
As it is now supported. Signed-off-by: Sebastian Wicki <[email protected]>
These are great questions!
Yes. The high-level source of information here is not changed. Previously, the node manager's
Yes, this dependency is enforced by the agent. Endpoints are not regenerated until IPCache is fully populated. We wait for the K8s caches to be synced (and the initial IPCache revision to be realized) before we allow endpoint regeneration to start.
Since IPCache is re-created when the agent restarts, there are no stale pod CIDRs to clean up (at least with regards to IPCache - when it comes to the |
I just realized that keeping the old tunnel map around for downgrade purposes is pretty pointless, since we re-create it every time the agent restarts (Line 162 in c6f05a5). So it might make sense to re-add
Putting this back into draft, as we have discovered that this needs a cluster-id aware upsert into IPCache.
Awesome, thank you! This helps a lot to understand the bigger picture. And it also clarifies what happens on downgrade - as the IPCache is rebuilt, the PodCIDR entries naturally disappear again.
Just wanted to add to this to say that this is exactly what I've been looking for. So I'll also wait for this one to merge, and in the meantime use it as a base for my testing.
Superseded by #38483
This PR introduces support for tunnel routing with the Multi-Pool IPAM mode. This is achieved by removing the reliance on the tunnel map as a fallback mechanism and instead using pod CIDR entries in IPCache as that fallback. The pod CIDR entries contain the node's tunnel endpoint and encryption key and have the `reserved:world` identity (i.e. the same identity as the existing catch-all fallback, which would have been hit before this PR).

The bulk of the work in this PR was done by @pippolo84. This PR is a replacement for his PR #37146 with a few notable changes:

- The commit `node: Remove insertions and deletions to tunnel map` has been dropped. In other words, even though the tunnel map is no longer used in the `main` branch, we still populate it. This addresses concerns with regards to downgrades.

This change will eventually allow us to remove the tunnel map completely (ref #20170).