Skip to content

MaintainIPPool refactor and support for aborting release handshake#18330

Merged
joestringer merged 3 commits intocilium:masterfrom
DataDog:hemanthmalla/refactor_abort
Jan 6, 2022
Merged

MaintainIPPool refactor and support for aborting release handshake#18330
joestringer merged 3 commits intocilium:masterfrom
DataDog:hemanthmalla/refactor_abort

Conversation

@hemanthmalla
Copy link
Copy Markdown
Member

@hemanthmalla hemanthmalla commented Dec 23, 2021

Imagine a scenario where a node has 2 unused IPs and pre-allocate set to one. Let's say one of the IPs is in the middle of a handshake and a new pod is scheduled on the node. The other unused IP would be allocated to the pod. Now, when the operator re-evaluates, the node is no longer considered to be in excess. Without this commit, the operator does not act further on IPs in this state. This results in a scenario where no new IPs are allocated to the node and agent cannot allocate the unused IPs because they're in the middle of a handshake.

Also refactors IP release and allocate functionality out of maintainIPPool()

Fix for a bug where unused IPs on the node cannot be allocated when IP release handshake is enabled. Adds support for aborting IP release, if the node doesn't have excess anymore.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 23, 2021
@hemanthmalla hemanthmalla force-pushed the hemanthmalla/refactor_abort branch from 6621042 to 22315d9 Compare December 23, 2021 12:13
@hemanthmalla hemanthmalla marked this pull request as ready for review December 23, 2021 12:21
@hemanthmalla hemanthmalla requested review from a team and twpayne December 23, 2021 12:21
@gandro gandro added area/eni Impacts ENI based IPAM. needs-backport/1.11 release-note/bug This PR fixes an issue in a previous release of Cilium. labels Dec 23, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 23, 2021
Copy link
Copy Markdown
Member

@gandro gandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thank you.

Only some minor nits, nothing blocking.

@hemanthmalla hemanthmalla force-pushed the hemanthmalla/refactor_abort branch from 22315d9 to 53f626a Compare December 23, 2021 14:22
@hemanthmalla
Copy link
Copy Markdown
Member Author

@gandro Thanks for the review 🙇
Incorporated all the feedback, updated the commit with changes.

@hemanthmalla
Copy link
Copy Markdown
Member Author

/ci-eks

@hemanthmalla
Copy link
Copy Markdown
Member Author

/test-runtime

@gandro
Copy link
Copy Markdown
Member

gandro commented Dec 23, 2021

/test-runtime

… node

Imagine a scenario where a node has 2 unused IPs and pre-allocate set to
1. Let's say one of the IPs is in the middle of a handshake and a new
pod is scheduled on the node. The other unused IP would be allocated to
the pod. Now, when the operator re-evaluates, the node is no longer
considered to be in excess. Without this commit, the operator does not
act further on IPs in this state. This results in a scenario where no
new IPs are allocated to the node and agent cannot allocate the unused IPs
because they're in the middle of a handshake.

Signed-off-by: Hemanth Malla <[email protected]>
@hemanthmalla hemanthmalla force-pushed the hemanthmalla/refactor_abort branch from 53f626a to ef5ed6b Compare December 31, 2021 16:17
@twpayne twpayne added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 4, 2022
Copy link
Copy Markdown
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. The changes LGTM, could you update the 2nd commit msg with the motivation behind the refactor?

@christarazi christarazi removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 5, 2022
@joestringer
Copy link
Copy Markdown
Member

Given the current title of the PR plus the labels, if we merge & backport & release this as part of v1.11.1, then the PR will show up in the release notes under "Bugfixes" as MaintainIPPool refactor and support for aborting release handshake. This doesn't sound like a bugfix. Could you improve the title or add a ```release-note::...``` section into the PR description that more succinctly describes the bugfix? Maybe something like "Fix bug where Cilium cannot allocate unused IPs from the ENI IPAM pool"?

…ol()

With the addition of IP release handhake, maintainIPPool() became too
long and not very readable. So, moving release and allocate logic into
their own functions.

Signed-off-by: Hemanth Malla <[email protected]>
@hemanthmalla hemanthmalla force-pushed the hemanthmalla/refactor_abort branch from ef5ed6b to ef6b386 Compare January 5, 2022 21:19
@hemanthmalla
Copy link
Copy Markdown
Member Author

@christarazi @joestringer Updated the commit message and added a release note.

@joestringer
Copy link
Copy Markdown
Member

I see that this was previously marked ready-to-merge and you've only updated the commit message, so this should be good to merge now. Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/eni Impacts ENI based IPAM. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. release-note/bug This PR fixes an issue in a previous release of Cilium.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants