Skip to content

bgp: withdraw route when svc has no endpoints#40717

Merged
aditighag merged 10 commits intocilium:mainfrom
oblazek:ob-bgp-no-endpoints
Sep 18, 2025
Merged

bgp: withdraw route when svc has no endpoints#40717
aditighag merged 10 commits intocilium:mainfrom
oblazek:ob-bgp-no-endpoints

Conversation

@oblazek
Copy link
Copy Markdown
Contributor

@oblazek oblazek commented Jul 25, 2025

We have services with either externalTrafficPolicy or internalTrafficPolicy. We have discussed in the issue #40438 that according to the docs - externalTrafficPolicy applies to loadBalancerIP and/or externalIP and internalTrafficPolicy applies to clusterIP. There was also a comment that it doesn't matter where the client is located as e.g. loadBalancerIP can also be accessed from within the cluster (from e.g. pod) even though it's not that common.
When endpoints are taken into account the value of the externalTP or internalTP matters. In BGP world when either is set to Local, the IP (LBIP, externalIP or clusterIP) is announced only if the node hosts an endpoint. When eTP/iTP is set to the other value - Cluster, then all nodes will announce the IP even though they do not host any relevant pod/endpoint.

In the current codebase the code doesn't handle properly when service has zero endpoints and the eTP/iTP is set to Cluster. In this case only for Local the node will stop announcing the IP and the packet targetted to the service will be dropped here. The problem is that this doesn't take eTP/iTP value into account and dropps the packet in both cases.

We've agreed that for eTP/iTP == Cluster
we will not drop the packet, but will forward it to the stack. Aditionally our cilium bgp subsystem will stop announcing the IP so that setups that use e.g. anycast can benefit from this (and traffic will be forwarded to e.g. another DC which announces the same VIP).

This PR handles this scenario.

TLDR:

  • add a new flag which defaults to true, so that current behavior is not changed
  • keep track of services when corresponding epDiffstore objects has been updated
  • withdraw a route when service has 0 endpoint newly for eTP/iTP=Cluster
  • socklb - do not drop packets when service has 0 endpoints and eTP/iTP=Cluster

Fixes: #40438

cc @rastislavs

bgp: optionally withdraw route when service has 0 endpoints to allow balancing to a different DC/cluster (anycast)

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jul 25, 2025
@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch 2 times, most recently from e028b82 to 20c7c7a Compare July 28, 2025 07:08
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Jul 28, 2025

/test

@rastislavs
Copy link
Copy Markdown
Contributor

@oblazek I think you will also need to update tests in pkg/bgpv1/test/testdata to provide endpointslices, for example as in /pkg/bgpv1/test/testdata/svc-traffic-policy.txtar

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Jul 28, 2025

thanks :), will take a look

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Jul 31, 2025

/test

@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch 2 times, most recently from d18da70 to de09a1e Compare August 3, 2025 19:15
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 3, 2025

/test

@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch 2 times, most recently from 4e146fd to b214001 Compare August 4, 2025 10:13
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 4, 2025

/test

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 4, 2025

I seem to have broken adverts_tests.go tested with e.g. PRIVILEGED_TESTS=true go test -exec "sudo -E" . -test.run TestPrivilegedLBEgressAdvertisementWithLoadBalancerIP. Any idea what could be the reason @rastislavs ?

@rastislavs
Copy link
Copy Markdown
Contributor

I seem to have broken adverts_tests.go tested with e.g. PRIVILEGED_TESTS=true go test -exec "sudo -E" . -test.run TestPrivilegedLBEgressAdvertisementWithLoadBalancerIP. Any idea what could be the reason @rastislavs ?

I guess you will need to set BGPNoEndpointsRoutable to true in this test hive setup for BGPv1 tests:

f.cells = []cell.Cell{
cell.Config(k8sPkg.DefaultConfig),
// service
cell.Provide(k8sPkg.ServiceResource),
// endpoints
cell.Provide(k8sPkg.EndpointsResource),
// CiliumLoadBalancerIPPool
cell.Provide(k8sPkg.LBIPPoolsResource),
// cilium node
cell.Provide(func(lc cell.Lifecycle, c k8sClient.Clientset) daemon_k8s.LocalCiliumNodeResource {
store := resource.New[*cilium_api_v2.CiliumNode](
lc, utils.ListerWatcherFromTyped[*cilium_api_v2.CiliumNodeList](
c.CiliumV2().CiliumNodes(),
),
)
f.ciliumNode = store
return store
}),
// Provide the mocked client cells directly
cell.Provide(func() k8sClient.Clientset {
return f.fakeClientSet
}),
// Register the tables with the StateDB
registerTablesCell,
// Provide route table
cell.Provide(func() statedb.Table[*tables.Route] {
return routeTable.ToTable()
}),
// Provide device table
cell.Provide(func() statedb.Table[*tables.Device] {
return deviceTable.ToTable()
}),
// daemon config
cell.Provide(func() *option.DaemonConfig {
return &option.DaemonConfig{
EnableBGPControlPlane: conf.bgpEnable,
BGPSecretsNamespace: "bgp-secrets",
IPAM: conf.ipam,
}
}),
// local bgp state for inspection
cell.Invoke(func(bgp *agent.Controller) {
f.bgp = bgp
}),
cell.Invoke(func(m agent.BGPRouterManager) {
m.(*manager.BGPRouterManager).DestroyRouterOnStop(true) // fully destroy GoBGP server on Stop()
}),
metrics.Cell,
bgpv1.Cell,
}
f.hive = hive.New(f.cells...)

BTW, I should have said it earlier, but not sure if adding support for this to BGPv1 adds much value as it is already deprecated, probably doing this just for BGPv2 would suffice. But as you have already done all that work, lets do that, at least it will be in sync until we remove BGPv1.

@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch from b214001 to 560ba17 Compare August 5, 2025 21:42
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 5, 2025

/test

@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch from 560ba17 to b67a0f4 Compare August 6, 2025 05:10
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 6, 2025

/test

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 6, 2025

/ci-verifier

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Aug 6, 2025

ci-verifier fail is not relevant, failing on:

go: golang.org/dl/go1.24.5@latest: golang.org/[email protected]: verifying module: golang.org/[email protected]: Get "https://sum.golang.org/lookup/golang.org/[email protected]": dial tcp: lookup sum.golang.org on 127.0.0.53:53: server misbehaving

@oblazek oblazek marked this pull request as ready for review August 6, 2025 09:44
@oblazek oblazek requested review from a team as code owners August 6, 2025 09:44
@oblazek oblazek requested a review from bimmlerd August 6, 2025 09:44
@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch from a6f99d3 to f1aa910 Compare September 17, 2025 06:09
@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch from f1aa910 to 3ce87b3 Compare September 17, 2025 07:14
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

/test

Previously it was not possible to access services with 0 endpoints via
socketLB and all packets were dropped with EHOSTUNREACH.

With these changes it is possible to access services with 0 endpoints
which can resolve in traffic being forwarded to another DC that can
serve the same service (e.g. one with the same loadbalancerIP).

Signed-off-by: Ondrej Blazek <[email protected]>
* update helm to include enableNoServiceEndpointsRoutable top level flag

Signed-off-by: Ondrej Blazek <[email protected]>
* update lxc, nodeport logic around NO_SERVICE_BACKEND with new flag

Signed-off-by: Ondrej Blazek <[email protected]>
@oblazek oblazek force-pushed the ob-bgp-no-endpoints branch from 3ce87b3 to 896702f Compare September 17, 2025 11:11
@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

/test

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

ci-integration seems to be failing on FAIL │ 3.26s │ github.com/cilium/cilium/pkg/loadbalancer/redirectpolicy. but that runs normally on localhost 🤔

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

/ci-integration

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

/ci-ipsec-e2e

@oblazek
Copy link
Copy Markdown
Contributor Author

oblazek commented Sep 17, 2025

/ci-e2e-upgrade

@aditighag
Copy link
Copy Markdown
Member

ci-integration seems to be failing on FAIL │ 3.26s │ github.com/cilium/cilium/pkg/loadbalancer/redirectpolicy. but that runs normally on localhost 🤔

Can you file an issue per the documented steps - https://docs.cilium.io/en/stable/contributing/testing/ci/#id1?

@aditighag aditighag removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Sep 17, 2025
@aditighag aditighag enabled auto-merge September 17, 2025 21:50
@aditighag
Copy link
Copy Markdown
Member

@oblazek FYI -- merging is blocked as there were some unresolved comments. I resolved all of them to unblock merging, including this one as you mentioned addressing that as a follow-up - #40717 (comment).

@aditighag aditighag disabled auto-merge September 17, 2025 21:54
@aditighag aditighag enabled auto-merge September 17, 2025 21:54
@giorio94 giorio94 added dont-merge/bad-bot To prevent MLH from marking ready-to-merge. and removed dont-merge/bad-bot To prevent MLH from marking ready-to-merge. labels Sep 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/minor This PR changes functionality that users may find relevant to operating Cilium.

Projects

No open projects
Status: Released

Development

Successfully merging this pull request may close these issues.

Cilium LoadBalancer w/Local and BGP drops traffic on no endpoints

10 participants