Skip to content

Excess IP release handshake 1.8 backport#81

Merged
hemanthmalla merged 8 commits into1.8.x-ddfrom
hemanth.malla/release_handshake_1.8_backport
Jan 6, 2022
Merged

Excess IP release handshake 1.8 backport#81
hemanthmalla merged 8 commits into1.8.x-ddfrom
hemanth.malla/release_handshake_1.8_backport

Conversation

@hemanthmalla
Copy link
Copy Markdown

@hemanthmalla hemanthmalla commented Dec 13, 2021

Backport of cilium#17939, cilium#18217 and cilium#18330 to cilium 1.8.7.

See commit and PR description for more details.

Currently there's a potential 15 sec delay in updating ciliumnode CRD after IP
allocation to a pod, meanwhile the operator can determine that a node has
excess IPs and release the IP causing pod connectivity issues.

A new operator flag `excess-ip-release-delay` is added to control how long
operator should wait before considering an IP for release(defaults to 180 secs).
This is done to better handle IP reuse during rollouts. Operator and agent use
a new map in cilium node status .status.ipam.release-ips to exchange information
during the handshake.

Fixes: cilium#13412

Signed-off-by: Hemanth Malla <[email protected]>
After the handshake is complete and the operator is done releasing an IP,
CiliumNode status (release-ips) and spec (pool) are updated in two
consecutive requests. There's a tiny window between the two updates where
the entry is removed from .status.ipam.release-ips but the IP is still
present in spec.ipam.pool. It was possible that the IP could be allocated
between these requests.

This commit introduces a new state called released to deal with this.
Now agent removes the entry from release-ips only when the IP was
removed from .spec.ipam.pool as well.

Signed-off-by: Hemanth Malla <[email protected]>
Copy link
Copy Markdown

@JulienBalestra JulienBalestra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

* Fix for blocked state transition from ready-for-release to released
* Fix for unnecessary updates between agent and operator during handshake

Signed-off-by: Hemanth Malla <[email protected]>
crdAllocator.Allocate() aquires lock on allocator first and then on
nodestore. But updateLocalNodeResource() acquires locks in the opposite
order. This commit releases nodestore lock before acquiring allocator lock
to avoid potential deadlocks due to inconsistent lock ordering.

Signed-off-by: Hemanth Malla <[email protected]>
@hemanthmalla hemanthmalla force-pushed the hemanth.malla/release_handshake_1.8_backport branch from 46035d2 to e635317 Compare December 21, 2021 18:27
… node

Imagine a scenario where a node has 2 unused IPs and pre-allocate set to
1. Let's say one of the IPs is in the middle of a handshake and a new
pod is scheduled on the node. The other unused IP would be allocated to
the pod. Now, when the operator re-evaluates, the node is no longer
considered to be in excess. Without this commit, the operator does not
act further on IPs in this state. This results in a scenario where no
new IPs are allocated to the node and agent cannot allocate the unused IPs
because they're in the middle of a handshake.

Signed-off-by: Hemanth Malla <[email protected]>
@hemanthmalla hemanthmalla force-pushed the hemanth.malla/release_handshake_1.8_backport branch from 8a60d01 to 36a4268 Compare December 30, 2021 21:22
@hemanthmalla hemanthmalla merged commit 7688cbc into 1.8.x-dd Jan 6, 2022
@HadrienPatte HadrienPatte deleted the hemanth.malla/release_handshake_1.8_backport branch March 26, 2025 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants