Skip to content

IPAM: handle subnets bigger than "/64"#49223

Merged
thaJeztah merged 4 commits intomoby:masterfrom
robmry:ipam_with_more_bits
Jan 21, 2025
Merged

IPAM: handle subnets bigger than "/64"#49223
thaJeztah merged 4 commits intomoby:masterfrom
robmry:ipam_with_more_bits

Conversation

@robmry
Copy link
Copy Markdown
Contributor

@robmry robmry commented Jan 7, 2025

- What I did

The default IPAM allocator is unable to represent subnets larger than 64-bits (subnets with a smaller prefix), because it uses a Bitmap that's limited to 64-bits.

When it's used to represent a 64-bit subnet, the top address can't be allocated (because bitmap.Bitmap is initialised with the number of bits it needs to represent in a uint64, so it's one short).

The rest of the daemon doesn't know about these limitations, so strange things happen when a large IPv6 subnet is used. No errors are reported, addresses/subnets are just set up incorrectly. The IPAM code calculates offsets into the bitmap itself, details it shouldn't need to understand and, because it's working on offsets into a range it doesn't always notice when it's asked to set a bit outside the range (see the notes about "offset" in #48319).

It's unusual to need a big subnet but, for example, it may be useful for modelling an ISP network, or an ISP's gateway may be in a "/56" subnet that's outside a 64-bit range used by hosts.

- How I did it

  • Don't increment "unselected" in Bitmap when clearing a 0
    • Bug fix in the bitmap.Bitmap code
  • Add addrset.AddrSet to track a set of IP addresses
    • A new internal package to track IP addresses.
    • It owns a bitmap.Bitmap for each range of up to 63-bits that has a bit set, within the subnet it's asked to represent.
  • Use addrset.AddrSet instead of bitmap.Bitmap in IPAM

- How to verify it

New unit tests for AddrSet, plus existing tests (including the two "XFAIL" tests that now pass).

- Description for the changelog

- IPAM now handles subnets bigger than "/64".

@robmry robmry added area/networking Networking kind/bugfix PR's that fix bugs area/networking/ipv6 Networking area/networking/ipam Networking labels Jan 7, 2025
@robmry robmry self-assigned this Jan 7, 2025
@robmry robmry force-pushed the ipam_with_more_bits branch 6 times, most recently from 93754fa to cc0af4d Compare January 9, 2025 11:28
@robmry robmry added this to the 28.0.0 milestone Jan 9, 2025
@robmry robmry marked this pull request as ready for review January 9, 2025 11:52
@robmry robmry changed the title WIP - handle subnets bigger than "/64" IPAM: handle subnets bigger than "/64" Jan 9, 2025
Comment thread libnetwork/bitmap/sequence.go Outdated
Comment thread libnetwork/bitmap/sequence_test.go
Comment thread libnetwork/ipams/defaultipam/allocator_test.go
Comment thread libnetwork/internal/addrset/addrset.go Outdated
Comment thread libnetwork/internal/addrset/addrset.go
@robmry robmry force-pushed the ipam_with_more_bits branch 2 times, most recently from 70ee756 to 8924145 Compare January 9, 2025 18:37
Comment thread libnetwork/internal/addrset/addrset.go Outdated
// Add adds address addr to the set. If addr is already in the set, it returns a
// wrapped [ErrAllocated]. If addr is not in the set's address range, it returns
// an error.
func (as AddrSet) Add(addr netip.Addr) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although technically we can get away with value receivers while mutating the map, Go consensus is to use a pointer receiver on methods which mutate the receiver for clarity of API.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to pointers.

Comment thread libnetwork/internal/addrset/addrset.go Outdated
// no addresses are available, it returns a wrapped [ErrNotAvailable].
//
// When serial=true, the set is scanned starting from the address following
// the address most recently set by [AddrSet.SetAny] or [AddrSet.SetAnyInRange].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// the address most recently set by [AddrSet.SetAny] or [AddrSet.SetAnyInRange].
// the address most recently set by [AddrSet.AddAny] or [AddrSet.AddAnyInRange].

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment thread libnetwork/internal/addrset/addrset.go Outdated
Comment on lines +76 to +79
// If the pool has maxBitsPerBitmap or fewer, the key for the only bitmap is the pool's address.
// If there's more than one bitmap, they all have maxBitsPerBitmap. So, assume there's a free
// address in the bitmap keyed on the pool's address (no need to search other bitmaps, or work
// out if more bitmaps could be created).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarify that this assumption is unsound, with rationale

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "unsound" assumption would be that anything in libnetwork is capable of tracking, or needs to track, 2^63 different IP addresses (9,223,372,036,854,775,808 addresses). Let alone more than that.

But, for the overly ambitious or concerned, I've reworded and included it in the godoc comment.

}

// AddAny adds an arbitrary address to the set, and returns that address. Or, if
// no addresses are available, it returns a wrapped [ErrNotAvailable].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document the limitations of this function. An ErrNotAvailable returned from this function does not necessarily mean that all host IDs in the prefix have been allocated.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above.

Comment thread libnetwork/internal/addrset/addrset.go Outdated
// If ipr is not fully contained within the set's range, it returns an error.
//
// When serial=true, the set is scanned starting from the address following
// the address most recently set by [AddrSet.SetAny] or [AddrSet.SetAnyInRange].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// the address most recently set by [AddrSet.SetAny] or [AddrSet.SetAnyInRange].
// the address most recently set by [AddrSet.AddAny] or [AddrSet.AddAnyInRange].

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@robmry robmry force-pushed the ipam_with_more_bits branch 2 times, most recently from daa5440 to 9a2d5df Compare January 15, 2025 12:19
@robmry
Copy link
Copy Markdown
Contributor Author

robmry commented Jan 15, 2025

I've added an extra commit, just to clarify the code that reserves network/broadcast addresses.

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Copy Markdown
Member

@akerouanton akerouanton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small nit, otherwise LGTM

// Add an address that's already present. Expect an error.
err = as.Add(netip.MustParseAddr("10.20.255.255"))
assert.Check(t, is.ErrorIs(err, ErrAllocated))
assert.Check(t, is.ErrorIs(err, ErrAllocated))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert.Check(t, is.ErrorIs(err, ErrAllocated))

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thank you - fixed.

The default IPAM allocator is unable to represent subnets larger than
64-bits (subnets with a smaller prefix), because it uses a Bitmap
that's limited to 64-bits.

When it's used to represent a 64-bit subnet, the top address can't
be allocated (because bitmap.Bitmap is initialised with the number
of bits it needs to represent in a uint64, so it's one short).

The rest of the daemon doesn't know about these limitations, so
strange things happen when a large IPv6 subnet is used.

No errors are reported, addresses/subnets are just set up incorrectly.
The IPAM code calculates offsets into the bitmap itself, details it
shouldn't need to understand and, because it's working on offsets
into a range it doesn't always notice when it's asked to set a bit
outside the range.

It's unusual to need a big subnet but, for example, it may be useful
for modelling an ISP network, or an ISP's gateway may be in a "/56"
subnet that's outside a 64-bit range used by hosts.

So, use addrset.AddrSet instead of bitmap.Bitmap.

Signed-off-by: Rob Murray <[email protected]>
The first address in an IPv6 range was reserved, but that wasn't
clear from comments or the code.

Signed-off-by: Rob Murray <[email protected]>
@robmry robmry force-pushed the ipam_with_more_bits branch from 9a2d5df to 366f2b5 Compare January 20, 2025 16:49
@thaJeztah thaJeztah merged commit 29e2353 into moby:master Jan 21, 2025
@robmry robmry deleted the ipam_with_more_bits branch January 21, 2025 10:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment