Error removing identity not added to the identity manager! on agent init #16419

@pchaigno

When restoring endpoints on startup, two threads can race to update an endpoint's identity. The first thread sets the identity for restored endpoints, while the second recomputes the identity if the labels of the pod (or of the node, in the case of the host endpoint) changed. If the second thread registers an identity in the identity manager before the first does, the result is the error "removing identity not added to the identity manager!".

This error seems to be more frequent with the host endpoint because the identity update triggered by a label update might happen sooner, via InitHostEndpointLabels().

The following log trace shows the error happening, with the second thread trying to update (by first removing) the host endpoint's identity from the manager at 22:08:47.03, and the first thread setting the identity only later, at 22:08:57.13.

2021-06-02T22:08:47.037805506Z level=debug msg="Refreshing labels of endpoint" containerID= endpointID=2653 identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" infoLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" subsys=endpoint
2021-06-02T22:08:47.037816164Z level=info msg="Resolving identity labels (blocking)" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=2653 identity=1 identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-02T22:08:47.037820655Z level=debug msg="Resolving identity for labels" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=2653 identity=1 identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-02T22:08:47.037824485Z level=debug msg="Resolving identity" identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" subsys=identity-cache
2021-06-02T22:08:47.037828071Z level=debug msg="Resolved reserved identity" identity=host identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" isNew=false subsys=identity-cache
2021-06-02T22:08:47.037831528Z level=debug msg="Assigned new identity to endpoint" containerID= datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=2653 identity=1 identityLabels="k8s:cilium.io/ci-node=k8s1,k8s:node-role.kubernetes.io/control-plane,k8s:node-role.kubernetes.io/master,reserved:host" ipv4= ipv6= k8sPodName=/ subsys=endpoint
2021-06-02T22:08:47.037835485Z level=debug msg="removing old and adding new identity" new=1 old=1 subsys=identitymanager
2021-06-02T22:08:47.037838767Z level=error msg="removing identity not added to the identity manager!" identity=1 subsys=identitymanager
[...]
2021-06-02T22:08:57.133117604Z level=info msg="Restored endpoint" endpointID=2653 ipAddr="[ ]" subsys=endpoint

This error happens regularly in CI because the host firewall tests set and unset a node label.

Labels: area/agent (Cilium agent related), kind/bug (This is a bug in the Cilium logic), pinned (These issues are not marked stale by our issue bot)