The following came up in cilium-cli testing:
[gke_***_us-west2-a_cilium-cilium-cli-936991635] Waiting for Cilium pod kube-system/cilium-9h9vv to have all the pod IPs in eBPF ipcache...
Error: Connectivity test failed: timeout reached waiting for pod IDs in ipcache of Cilium pod kube-system/cilium-9h9vv
error: cannot exec into a container in a completed pod; current phase is Failed
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
cilium-test client-8655c47fd7-gw9t9 1/1 Running 0 12m 10.12.1.26 gke-cilium-cilium-cli-93-default-pool-9c603183-jt1d <none> <none>
cilium-test client2-657df6649d-46lm9 1/1 Running 0 12m 10.12.0.1 gke-cilium-cilium-cli-93-default-pool-9c603183-51zv <none> <none>
Note how client2's IP address is 10.12.0.1.
cilium bpf endpoint list reports it as "(localhost)":
IP ADDRESS LOCAL ENDPOINT INFO
10.12.0.123:0 id=2225 flags=0x0000 ifindex=15 mac=16:8B:35:19:4D:6A nodemac=56:E9:C3:93:D4:EC
10.12.0.140:0 id=468 flags=0x0000 ifindex=21 mac=92:59:2B:7D:04:6D nodemac=C6:A0:DC:10:D8:68
10.168.0.25:0 (localhost)
10.12.0.1:0 (localhost)
10.12.0.98:0 (localhost)
10.12.0.169:0 id=2718 flags=0x0000 ifindex=17 mac=6A:8A:A6:E3:E1:22 nodemac=5A:AB:9F:23:44:CA
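The "(localhost)" rendering appears to come from the agent treating every address it discovered on host devices as node-local, so a pod IP that collides with a host-owned address gets shadowed in the listing. A minimal Python model of that symptom (this is an illustration, not Cilium's actual Go code; the IPs and IDs are taken from the listing above):

```python
# Addresses the agent considers node-local. Per the daemon log further
# below, 10.12.0.1 was also picked up from a veth device, so it lands in
# this set alongside the node IP and the cilium_host internal IP.
host_local_ips = {"10.168.0.25", "10.12.0.98", "10.12.0.1"}

# Endpoints from the bpf endpoint list, plus client2's colliding pod IP.
endpoints = {
    "10.12.0.123": "id=2225",
    "10.12.0.140": "id=468",
    "10.12.0.169": "id=2718",
    "10.12.0.1": "id=105",  # client2 -- collides with a host-owned address
}

def render(ip: str) -> str:
    # A host-owned address shadows any endpoint with the same IP,
    # which is exactly the symptom in the listing above.
    if ip in host_local_ips:
        return f"{ip}:0 (localhost)"
    return f"{ip}:0 {endpoints[ip]}"

for ip in endpoints:
    print(render(ip))
```

Under this model, endpoint 105 can never show up with its own identity as long as 10.12.0.1 stays in the host-local set.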
The ipcache says the same:
The CiliumEndpoint (CEP) tells a different story:
external-identifiers:
  container-id: 96fc8c05cfa3b7320b8f7f96c920beb1714ca8030e37d8661a2f21318c4baee5
  k8s-namespace: cilium-test
  k8s-pod-name: client2-657df6649d-46lm9
  pod-name: cilium-test/client2-657df6649d-46lm9
id: 105
identity:
  id: 5139
  labels:
  - k8s:io.cilium.k8s.policy.cluster=cilium-cilium-cli-936991635
  - k8s:io.cilium.k8s.policy.serviceaccount=default
  - k8s:io.kubernetes.pod.namespace=cilium-test
  - k8s:kind=client
  - k8s:name=client2
  - k8s:other=client
networking:
  addressing:
  - ipv4: 10.12.0.1
  node: 10.168.0.25
state: ready
The other Cilium node (not the one hosting the pod) has the correct entry in its ipcache:
10.12.0.1/32 5139 0 10.168.0.25
Cilium agent logs for the IP address and the endpoint:
2021-06-14T19:57:05.350225182Z level=info msg=" --native-routing-cidr='10.12.0.0/14'" subsys=daemon
...
2021-06-14T19:57:06.036223076Z level=info msg="Inheriting MTU from external network interface" device=vethfa1ca2a6 ipAddr=10.12.0.1 mtu=1460 subsys=mtu
...
2021-06-14T19:57:07.084221220Z level=info msg="Addressing information:" subsys=daemon
2021-06-14T19:57:07.084226830Z level=info msg=" Cluster-Name: cilium-cilium-cli-936991635" subsys=daemon
2021-06-14T19:57:07.084233530Z level=info msg=" Cluster-ID: 0" subsys=daemon
2021-06-14T19:57:07.084240329Z level=info msg=" Local node-name: gke-cilium-cilium-cli-93-default-pool-9c603183-51zv" subsys=daemon
2021-06-14T19:57:07.084246051Z level=info msg=" Node-IPv6: <nil>" subsys=daemon
2021-06-14T19:57:07.084252017Z level=info msg=" External-Node IPv4: 10.168.0.25" subsys=daemon
2021-06-14T19:57:07.084257553Z level=info msg=" Internal-Node IPv4: 10.12.0.98" subsys=daemon
2021-06-14T19:57:07.084263080Z level=info msg=" IPv4 allocation prefix: 10.12.0.0/24" subsys=daemon
2021-06-14T19:57:07.084268447Z level=info msg=" IPv4 native routing prefix: 10.12.0.0/14" subsys=daemon
2021-06-14T19:57:07.084274334Z level=info msg=" Loopback IPv4: 169.254.42.1" subsys=daemon
2021-06-14T19:57:07.084279828Z level=info msg=" Local IPv4 addresses:" subsys=daemon
2021-06-14T19:57:07.084311624Z level=info msg=" - 10.168.0.25" subsys=daemon
2021-06-14T19:57:07.084318146Z level=info msg=" - 10.12.0.1" subsys=daemon
2021-06-14T19:57:07.084323659Z level=info msg=" - 10.12.0.1" subsys=daemon
2021-06-14T19:57:07.084329084Z level=info msg=" - 10.12.0.1" subsys=daemon
2021-06-14T19:57:07.084334735Z level=info msg=" - 10.12.0.1" subsys=daemon
...
2021-06-14T19:57:34.905555589Z level=info msg="Create endpoint request" addressing="&{10.12.0.1 c310b283-cd4a-11eb-ade5-42010aa80019 }" containerID=96fc8c05cfa3b7320b8f7f96c920beb1714ca8030e37d8661a2f21318c4baee5 datapathConfiguration="&{false true false true 0xc001d317ea}" interface=lxc022e741785fa k8sPodName=cilium-test/client2-657df6649d-46lm9 labels="[]" subsys=daemon sync-build=true
2021-06-14T19:57:34.905689952Z level=info msg="New endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=105 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-14T19:57:34.905963746Z level=info msg="Resolving identity labels (blocking)" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=105 identityLabels="k8s:io.cilium.k8s.policy.cluster=cilium-cilium-cli-936991635,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=client,k8s:name=client2,k8s:other=client" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-14T19:57:34.906151677Z level=info msg="Skipped non-kubernetes labels when labelling ciliumidentity. All labels will still be used in identity determination" labels="map[]" subsys=crd-allocator
2021-06-14T19:57:34.914948184Z level=info msg="Allocated new global key" key="k8s:io.cilium.k8s.policy.cluster=cilium-cilium-cli-936991635;k8s:io.cilium.k8s.policy.serviceaccount=default;k8s:io.kubernetes.pod.namespace=cilium-test;k8s:kind=client;k8s:name=client2;k8s:other=client;" subsys=allocator
2021-06-14T19:57:34.915069668Z level=info msg="Identity of endpoint changed" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=105 identity=5139 identityLabels="k8s:io.cilium.k8s.policy.cluster=cilium-cilium-cli-936991635,k8s:io.cilium.k8s.policy.serviceaccount=default,k8s:io.kubernetes.pod.namespace=cilium-test,k8s:kind=client,k8s:name=client2,k8s:other=client" ipv4= ipv6= k8sPodName=/ oldIdentity="no identity" subsys=endpoint
2021-06-14T19:57:34.915084884Z level=info msg="Waiting for endpoint to be generated" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=105 identity=5139 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-14T19:57:35.718697564Z level=info msg="regenerating all endpoints" reason="one or more identities created or deleted" subsys=endpoint-manager
2021-06-14T19:57:36.014673719Z level=info msg="Rewrote endpoint BPF program" containerID= datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=105 identity=5139 ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-14T19:57:36.015919337Z level=info msg="Successful endpoint creation" containerID= datapathPolicyRevision=1 desiredPolicyRevision=1 endpointID=105 identity=5139 ipv4= ipv6= k8sPodName=/ subsys=daemon
2021-06-14T19:57:36.018217418Z level=info msg="API call has been processed" name=endpoint-create processingDuration=1.112696121s subsys=rate totalDuration=1.112817363s uuid=c313ad32-cd4a-11eb-ade5-42010aa80019 waitDurationTotal=0s
...
2021-06-14T19:57:39.126518574Z level=warning msg="Unable to update ipcache map entry on pod add" error="ipcache entry for podIP 10.12.0.1 owned by kvstore or agent" hostIP=10.12.0.1 k8sNamespace=cilium-test k8sPodName=client2-657df6649d-46lm9 podIP=10.12.0.1 podIPs="[{10.12.0.1}]" subsys=k8s-watcher
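The final warning matches Cilium's ipcache source-precedence rule: an entry installed by a stronger source (the agent itself, or the kvstore) cannot be overwritten by a weaker one (the Kubernetes pod watcher). A hedged sketch of that rule in Python (names and the exact ordering are illustrative; the real ordering lives in Cilium's Go code):

```python
# Higher number = stronger owner of an ipcache entry.
PRECEDENCE = {"kubernetes": 1, "kvstore": 2, "local": 3}  # "local" = the agent

ipcache: dict[str, tuple[object, str]] = {}

def upsert(ip: str, identity: object, src: str) -> None:
    """Insert ip -> identity unless a stronger source already owns the entry."""
    owner = ipcache.get(ip)
    if owner is not None and PRECEDENCE[owner[1]] > PRECEDENCE[src]:
        raise ValueError(f"ipcache entry for podIP {ip} owned by {owner[1]}")
    ipcache[ip] = (identity, src)

# The agent registered 10.12.0.1 as a host-local address first...
upsert("10.12.0.1", "host", "local")

# ...so the k8s pod watcher's later update with identity 5139 is rejected,
# producing the "Unable to update ipcache map entry on pod add" warning.
try:
    upsert("10.12.0.1", 5139, "kubernetes")
except ValueError as e:
    print(e)
```

This would explain why the remote node's ipcache (fed by a different path) is correct while the local node is stuck with the "(localhost)" entry.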
General Information
- Cilium version: 1.9.8
- Kernel version:
Linux gke-cilium-cilium-cli-93-default-pool-9c603183-51zv 5.4.89+ #1 SMP Sat Feb 13 19:45:14 PST 2021 x86_64 x86_64 x86_64 GNU/Linux
- Orchestration system version in use:
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.1-5-g76a04fc", GitCommit:"95881afb5df065c250d98cf7f30ee4bb6d281acf", GitTreeState:"clean", BuildDate:"2021-05-21T04:25:07Z", GoVersion:"go1.15.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"19+", GitVersion:"v1.19.9-gke.1900", GitCommit:"008fd38bf3dc201bebdd4fe26edf9bf87478309a", GitTreeState:"clean", BuildDate:"2021-04-14T09:22:08Z", GoVersion:"go1.15.8b5", Compiler:"gc", Platform:"linux/amd64"}
cilium-sysdump-out.zip
How to reproduce the issue
Unknown; it occurs as a test flake.