Is there an existing issue for this?
Version
equal or higher than v1.16.0 and lower than v1.17.0
What happened?
When configuring CiliumBGPAdvertisement with overlapping selector-based matches, only the last match in sequence is used and earlier matches are ignored. No errors are thrown. Based on https://github.com/cilium/cilium/blob/main/pkg/bgpv1/manager/reconcilerv2/service.go#L295 and my own log messages, overlapping matches are carried through to this loop, where each match is overwritten by the next.
I believe an assumption was made that the administrator would define only a single match for each combination of advertisementType and selector. The lack of an error message leads me to believe that I am trying to configure Cilium for a use case that wasn't considered, rather than one that is intentionally unsupported.
As a cloud operator, I have a use case that would ideally be served by setting a dynamic combination of BGP communities based on the workload deployed. As an example, consider:
- Matching the label "customer" which sets the BGP Community 100:100.
- Matching the label "customer-vpc" which sets the BGP Community 110:110.
- Matching the label "backbone" which specifies the BGP Community 200:200.
In this example, all advertisements from the customer's workload include at minimum BGP Community 100:100. Some advertisements may contain BGP Communities 100:100 and 110:110, while others may contain 100:100 and 200:200.
I would like to modify Cilium to permit overlapping matches for CiliumBGPAdvertisement.advertisements with the following logic:
- When overlapping entries define communities (standard, wellKnown, or large), treat it as an append operation. All matches within the same type (standard vs. large) are combined.
- When overlapping entries define localPreference, whether alone or alongside communities, and their localPreference values differ, throw an error and stop processing the CR.
Example:
- advertisementType: "Service"
  service:
    addresses:
      - ClusterIP
  selector:
    matchExpressions:
      - {key: somekey, operator: NotIn, values: ['never-used-value']}
  attributes:
    communities:
      standard: [ "101:101" ]
- advertisementType: "Service"
  service:
    addresses:
      - ClusterIP
  selector:
    matchExpressions:
      - {key: somekey, operator: NotIn, values: ['never-used-value']}
  attributes:
    communities:
      standard: [ "202:202" ]
Under the proposed logic, Cilium would process this the same as:
- advertisementType: "Service"
  service:
    addresses:
      - ClusterIP
  selector:
    matchExpressions:
      - {key: somekey, operator: NotIn, values: ['never-used-value']}
  attributes:
    communities:
      standard: [ "101:101", "202:202" ]
How can we reproduce the issue?
Cilium Configuration:
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-cp
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-control-plane
  bgpInstances:
    - name: "control-plane"
      localASN: 65001
      peers:
        - name: "frr"
          peerASN: 65000
          peerAddress: 10.0.1.1
          peerConfigRef:
            name: "cilium-cp-frr-peer"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPClusterConfig
metadata:
  name: cilium-bgp-worker
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-worker
  bgpInstances:
    - name: "worker"
      localASN: 65002
      peers:
        - name: "frr"
          peerASN: 65000
          peerAddress: 10.0.2.1
          peerConfigRef:
            name: "cilium-cp-frr-peer"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeerConfig
metadata:
  name: "cilium-cp-frr-peer"
spec:
  gracefulRestart:
    enabled: true
  families:
    - afi: ipv4
      safi: unicast
      advertisements:
        matchLabels:
          advertise: "bgp"
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: "bgp-advertisements"
  labels:
    advertise: "bgp"
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "101:101" ]
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "202:202" ]
    - advertisementType: "Service"
      service:
        addresses:
          - ClusterIP
      selector:
        matchExpressions:
          - {key: somekey, operator: NotIn, values: ['never-used-value']}
      attributes:
        communities:
          standard: [ "303:303" ]
---
Cilium Version
$ cilium version
cilium-cli: v0.16.7-40-g9316d0ac compiled with go1.22.2 on linux/amd64
cilium image (default): v1.15.5
cilium image (stable): v1.16.3
cilium image (running): 1.17.0-dev
I am running a local build of Cilium using the BGP Control Plane's Kind dev environment:
git log
commit d32ff87a6aa69b64f027fae40083d9148d283d1e (HEAD -> dswaffordcw/bgp-adverts-additive, upstream/main, main)
Author: André Martins <[email protected]>
Date: Thu Sep 12 11:07:02 2024 +0200
Kernel Version
Linux hostname 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Kubernetes Version
$ k version
Client Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.15-enhanced-describe-dirty", GitCommit:"ac2e2baa7d4039cc4c68f2e869e4edbe2d60b305", GitTreeState:"dirty", BuildDate:"2023-03-02T00:33:46Z", GoVersion:"go1.20.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"30", GitVersion:"v1.30.0", GitCommit:"7c48c2bd72b9bf5c44d21d7338cc7bea77d0ad2a", GitTreeState:"clean", BuildDate:"2024-05-13T22:00:36Z", GoVersion:"go1.22.2", Compiler:"gc", Platform:"linux/amd64"}
Regression
No response
Sysdump
No response
Relevant log output
FRR (from BGP Control Plane's Dev Environment)
router0# show ip bgp sum
IPv4 Unicast Summary (VRF default):
BGP router identifier 10.0.0.1, local AS number 65000 vrf-id 0
BGP table version 3
RIB entries 5, using 960 bytes of memory
Peers 2, using 1434 KiB of memory
Peer groups 1, using 64 bytes of memory
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd PfxSnt Desc
10.0.1.2 4 0 0 0 0 0 0 never Active 0 N/A
10.0.2.2 4 65002 4 4 0 0 0 00:00:01 3 3 N/A
Total number of neighbors 2
router0# show ip bgp
BGP table version is 6, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 65000
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found
Network Next Hop Metric LocPrf Weight Path
*= 10.2.0.1/32 10.0.1.2 0 65001 i
*> 10.0.2.2 0 65002 i
*= 10.2.0.10/32 10.0.1.2 0 65001 i
*> 10.0.2.2 0 65002 i
*= 10.2.236.92/32 10.0.1.2 0 65001 i
*> 10.0.2.2 0 65002 i
Displayed 3 routes and 6 total paths
router0# show ip bgp 10.2.0.1
BGP routing table entry for 10.2.0.1/32, version 5
Paths: (2 available, best #2, table default)
Advertised to non peer-group peers:
10.0.1.2 10.0.2.2
65001
10.0.1.2 from 10.0.1.2 (10.0.1.2)
Origin IGP, valid, external, multipath
Community: 303:303
Last update: Mon Nov 4 00:04:01 2024
65002
10.0.2.2 from 10.0.2.2 (10.0.2.2)
Origin IGP, valid, external, multipath, best (Older Path)
Community: 303:303
Last update: Mon Nov 4 00:03:59 2024
Anything else?
No response