Is there an existing issue for this?
Version
equal or higher than v1.16.11 and lower than v1.17.0
What happened?
Hi,
When I have a LoadBalancer service with no endpoints (due to health checks, etc.), BGP (with anycast using ECMP at the ToR/router level), and externalTrafficPolicy: Local, all traffic to that service IP from within the k8s cluster is dropped for as long as no endpoints exist.
How can we reproduce the issue?
- Install Cilium
- Set up BGP peering with the switch/router
- Create a Cilium LoadBalancer advertised over BGP
- Create a Service with no existing endpoints, like this:
apiVersion: v1
kind: Service
metadata:
  annotations:
    io.cilium/lb-ipam-ips: 10.229.8.15
    io.cilium/lb-ipam-sharing-key: ashley1.oa
    lbipam.cilium.io/ips: 10.229.8.15
    lbipam.cilium.io/sharing-key: ashley1.oa
  labels:
    bgp: yespls
  name: apiserver-test
spec:
  allocateLoadBalancerNodePorts: false
  externalTrafficPolicy: Local
  internalTrafficPolicy: Cluster
  loadBalancerClass: io.cilium/bgp-control-plane
  ports:
  - name: apiserver
    port: 6443
    protocol: TCP
    targetPort: 6443
  selector:
    app: apiservers
  sessionAffinity: None
  type: LoadBalancer
- Make sure the Service has NO working endpoints
- Try curling the 10.229.8.15 IP from within the cluster
- Observe:
root@worker-az2-b8f5bfb5-l6h4q:/# curl -v http://10.229.8.15:6443
* Trying 10.229.8.15:6443...
* Immediate connect fail for 10.229.8.15: Operation not permitted
* Closing connection 0
curl: (7) Couldn't connect to server
- Observe that traffic to the IP is dropped somewhere in the BPF code instead of being let through to the BGP-advertised route, which defeats the purpose of using BGP and anycast
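To tell this failure apart from an ordinary refusal or timeout, a small probe can report the errno behind the connect attempt. This is a minimal sketch, not part of the original report: the helper names `classify_connect_error` and `probe` and the short labels are hypothetical. It relies on the fact that curl's "Operation not permitted" corresponds to EPERM, which Linux documents for connect() when a local firewall rule denies the request. That the denial here comes from Cilium's datapath is the reporter's assumption, not something the probe can prove.

```python
import errno
import socket


def classify_connect_error(err_no):
    """Map a connect() errno to a rough label (labels are illustrative only)."""
    if err_no == 0:
        return "connected"
    if err_no == errno.EPERM:
        # Same error curl printed above: "Operation not permitted".
        return "local-deny"
    if err_no == errno.ECONNREFUSED:
        return "refused"
    if err_no in (errno.EHOSTUNREACH, errno.ENETUNREACH):
        return "unreachable"
    if err_no in (errno.ETIMEDOUT, errno.EAGAIN, errno.EWOULDBLOCK):
        # connect_ex() on a socket with a timeout reports EAGAIN/EWOULDBLOCK.
        return "timeout"
    return "other:" + errno.errorcode.get(err_no, str(err_no))


def probe(host, port, timeout=3.0):
    """Attempt one TCP connect and classify the result."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return classify_connect_error(sock.connect_ex((host, port)))


# Example, run from a pod or node inside the cluster:
#   print(probe("10.229.8.15", 6443))
```

A "local-deny" result matches the curl output in this report, whereas "refused" or "timeout" would suggest the packet left the node and failed further along the path.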
Cilium Version
tested with 1.15.x (multiple versions) and 1.16.8
Kernel Version
6.11.0 branch, Ubuntu
Kubernetes Version
tested 1.32.x and 1.33.x
Regression
No response
Sysdump
No response
Relevant log output
Anything else?
No response