Skip to content

Replace LB-IPAM IP allocator#26488

Merged
dylandreimerink merged 5 commits intocilium:mainfrom
dylandreimerink:feature/new-lbipam-alloc
Oct 24, 2023
Merged

Replace LB-IPAM IP allocator#26488
dylandreimerink merged 5 commits intocilium:mainfrom
dylandreimerink:feature/new-lbipam-alloc

Conversation

@dylandreimerink
Copy link
Copy Markdown
Member

@dylandreimerink dylandreimerink commented Jun 26, 2023

This PR introduces a new IP allocator, replaces the existing IP allocator used by LB-IPAM with this new allocator. This removes the limitation on IPv6 ranges, the full range can now be used (only limited by the resources required to book keep). It also removes the reservation of the base and broadcast IP address making it possible to allocate single IPs. Lastly, support is now added for non-CIDR IP ranges such as x.x.x.100-x.x.x.200.

Fixes: #24351
Fixes: #22005
Fixes: #28255

Replace LB-IPAM IP allocator to remove limitations and enable additional features

@dylandreimerink dylandreimerink added kind/enhancement This would improve or streamline existing functionality. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs feature/lb-ipam labels Jun 26, 2023
@w7team
Copy link
Copy Markdown

w7team commented Jun 28, 2023

这个pr什么时候合并?

Translation by Google Translate:

when will this pr merge?

@dylandreimerink
Copy link
Copy Markdown
Member Author

when will this pr merge?

We are currently stabilizing for the v1.14 version cut. This PR will be undrafted as soon as we can merge feature into the main branch again. First actual release is expected to be v1.15-snapshot.1 whenever that will be.

@YutaroHayakawa
Copy link
Copy Markdown
Member

Thanks for working on this! Wrt the network and broadcast address, I'm a bit worrying about enabling it by default. MetalLB supports disabling those addresses with AvoidBuggyIPs flag for dealing with buggy devices that filter those IPs (https://metallb.universe.tf/configuration/_advanced_ipaddresspool_configuration/).

Our current implementation is sort of accidentally setting AvoidBuggyIPs: true by default. If we start to include "buggy" addresses by default, some users that have such buggy devices may be affected.

Do you think it does make sense to make it opt-in?

@dylandreimerink
Copy link
Copy Markdown
Member Author

Wrt the network and broadcast address, I'm a bit worrying about enabling it by default.

My original thinking was that, with ranges, it would not be difficult for users to declare a x.x.x.1-x.x.x.254 range if they do not want allocation of the base and broadcast IPs, keeping the CRD clean in that way

If we start to include "buggy" addresses by default, some users that have such buggy devices may be affected.

Yea, makes sense to me.

Do you think it does make sense to make it opt-in?

Yes, it makes sense to at least preserve existing behavior so users don't get surprised when updating. I think adding a AllowBuggyIPs (inverted) or a AvoidBuggyIPs which defaults to true isn't to difficult.

@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch 2 times, most recently from 3501983 to 6fb87d6 Compare June 29, 2023 14:16
@joestringer joestringer removed the dont-merge/wait-until-release Freeze window for current release is blocking non-bugfix PRs label Jul 5, 2023
@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from 6fb87d6 to eeee6be Compare July 10, 2023 11:52
@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from eeee6be to e0c791a Compare July 21, 2023 12:05
@omniproc
Copy link
Copy Markdown

Wrt the network and broadcast address, I'm a bit worrying about enabling it by default.

My original thinking was that, with ranges, it would not be difficult for users to declare a x.x.x.1-x.x.x.254 range if they do not want allocation of the base and broadcast IPs, keeping the CRD clean in that way

If we start to include "buggy" addresses by default, some users that have such buggy devices may be affected.

Yea, makes sense to me.

Do you think it does make sense to make it opt-in?

Yes, it makes sense to at least preserve existing behavior so users don't get surprised when updating. I think adding a AllowBuggyIPs (inverted) or a AvoidBuggyIPs which defaults to true isn't to difficult.

It would be nice to get a propper error message why the first and last IP of a provided range is ignored in that case so user's don't need to wonder why cillium is not picking them up (unfortunately normal ppl don't calculate subnetmasks in their head on the fly).

@dylandreimerink
Copy link
Copy Markdown
Member Author

dylandreimerink commented Jul 26, 2023

It would be nice to get a propper error message why the first and last IP of a provided range is ignored in that case so user's don't need to wonder why cillium is not picking them up (unfortunately normal ppl don't calculate subnetmasks in their head on the fly).

I assume you mean in the status of a service when a service is specifically requesting a "buggy IP"? Or did you mean more broadly?

If we start to include "buggy" addresses by default, some users that have such buggy devices may be affected.

Yea, makes sense to me.

I am also contemplating making a small exception here. For /32 or /31 ranges, having 1 or 2 IPs, it doesn't make sense under any circumstance to reserve the first and last IP because they might be "buggy". So only /30 and bigger are effected by the setting (/126 for IPv6). That should also help with the UX. And it should not break to many existing setups since /31 and /32 ranges currently are unusable anyway

@omniproc
Copy link
Copy Markdown

omniproc commented Jul 30, 2023

I assume you mean in the status of a service when a service is specifically requesting a "buggy IP"? Or did you mean more broadly?

I mean in the status of the pool CRD when a CIDR was specified and the user tries to use the first or the last IP of that pool (by setting the LoadBalancerIP explicitly in a service). Also logs in the operator should prbl. log an info that the first and last IP of the CIDR are ignored by default.
In contrast to that I think "range" notation is more useful then flags not not ignore the first and last IP of a provided CIDR range. So, for example:

  • CIDR: x.x.x.x/30 -> first and last IP are ignored because CIDR notation was used
  • Range: x.x.x.x - x.x.x.y -> first and last IP are not ignored, it's the user's responsibility to not mess up the (human readable friendly) range

That's how other projects, e.g. kubevip, do it. See config map example here https://kube-vip.io/docs/usage/cloud-provider/

@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from e0c791a to 1660d95 Compare August 8, 2023 12:31
@dylandreimerink
Copy link
Copy Markdown
Member Author

/test

@dylandreimerink
Copy link
Copy Markdown
Member Author

So with regards to dodgy IPs I have settled on the following which I think is the best of all worlds:

  • Dodgy IPs are reserved by default on CIDR ranges
  • /32, /31 IPv4 and /128, /127 IPv6 CIDRs are fully available regardless of the Dodgy IP setting
  • Start+end IP ranges have no reserved IPs

With that I feel like this is where I want it to be, making ready for review.

@dylandreimerink dylandreimerink marked this pull request as ready for review August 8, 2023 13:50
@dylandreimerink dylandreimerink requested review from a team as code owners August 8, 2023 13:50
@dylandreimerink dylandreimerink requested review from a user, nebril and ysksuzuki August 8, 2023 13:50
@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from 6c60c92 to b3b5c0d Compare October 17, 2023 07:10
@dylandreimerink dylandreimerink removed the dont-merge/discussion A discussion is ongoing and should be resolved before merging, regardless of reviews & tests status. label Oct 17, 2023
@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from b3b5c0d to 0cf7036 Compare October 17, 2023 07:37
@dylandreimerink
Copy link
Copy Markdown
Member Author

/test

Currently LB-IPAM uses cilium/ipam/service/ipallocator to allocate its
IPs. While this is a decent allocator for most purposes, it is holding
LB-IPAM back in certain aspects.

This allocator aims to address the limitations encountered with the
current allocator such as:

* The new allocator supports non-CIDR ranges (x.x.x.100-x.x.x.200)
* The new allocator does not reserve the base and broadcast IPs
* The new allocator is not limited to 65536 IPs for IPv6
* The new allocator can associate arbitrary data with IPs via generics
  not just a boolean in-use or not-in-use

The removal of these limitations should allow us to make LB-IPAM more
user friendly and to implement some additional features.

Signed-off-by: Dylan Reimerink <[email protected]>
This commit replaces the underlying IP allocator used by LB-IPAM from
the cilium/ipam/service/ipallocator package to the new pkg/ipalloc
package.

This is just a 1-to-1 replacement without making use of any of the new
features of this allocator. The only change in behavior is that the
base and broadcast addresses are no longer reserved by default.

Signed-off-by: Dylan Reimerink <[email protected]>
Now that we have the new IP allocator, we can allocate non-CIDR ranges.
The ideal new syntax looks like:

```
[...]
spec:
  blocks:
  - cidr: "10.0.0.0/24"
  - start: "20.0.0.10"
    stop: "20.0.0.30"
  - start: "30.1.2.3"
```

The `cidrs` list has been renamed to blocks since we are now dealing
with blocks of IPs not just CIDRs anymore. The old `cidrs` field is
still supported and takes the same elements, so just an alias for
backwards compatibility.

Signed-off-by: Dylan Reimerink <[email protected]>
LB-IPAM was still reserving all IPs in a /32 or /31 IPv4 CIDR if
`allowFirstLastIPs` is `no`. This behavior doesn't make any sense, so
this commit makes it so we only apply this behavior for CIDRs with more
than two IPs.

Signed-off-by: Dylan Reimerink <[email protected]>
Added a new chapter to the LB-IPAM docs to explain the reservation
of the first and list IPs of CIDRs and how to allow them to be used.

Signed-off-by: Dylan Reimerink <[email protected]>
@dylandreimerink dylandreimerink force-pushed the feature/new-lbipam-alloc branch from 0cf7036 to 648ec65 Compare October 19, 2023 08:01
@dylandreimerink
Copy link
Copy Markdown
Member Author

/test

@clcc2019
Copy link
Copy Markdown

Hope to be merged into the new version as soon as possible, this is important to us

@gandro
Copy link
Copy Markdown
Member

gandro commented Oct 23, 2023

Hope to be merged into the new version as soon as possible, this is important to us

Please note as this is a new feature with API changes, it will land no earlier than Cilium 1.15 (i.e. it will not be backported to Cilium 1.14 in accordance with our backporting policy).

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 23, 2023
@dylandreimerink dylandreimerink merged commit 84471cc into cilium:main Oct 24, 2023
@kenlasko
Copy link
Copy Markdown

kenlasko commented Nov 4, 2023

FWIW, I did extensive testing with this new feature in 1.15-pre.2 and it works exactly as expected. Great work everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature/lb-ipam kind/enhancement This would improve or streamline existing functionality. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.

Projects

None yet