node/manager: synthesize node deletion events #33278
Merged
aanm merged 2 commits into cilium:main on Jul 4, 2024
Conversation
Force-pushed 3fb8e90 to 5d8c666
Member (Author)
/test
Force-pushed 5d8c666 to d19338c
Member (Author)
/scale-100
Member (Author)
/test (CI was green before the force push, but I wanted to have access to …
gandro reviewed Jun 24, 2024
Member gandro left a comment:
Awesome work! Looks excellent to me. Two very minor things.
Force-pushed d19338c to bdd860b
Member (Author)
/test
Force-pushed bdd860b to d0f2d92
nebril approved these changes Jun 27, 2024
Member nebril left a comment:
LGTM, one minor comment left inline.
Clearing the environment in the middle of the test can cause failures related to state being deleted, as the "environment" being cleared is simply the StateDir of the agent.

Fixes: 940b186 ("test/controlplane: Fix tests after removal of global hives")

Signed-off-by: David Bimmler <[email protected]>
When the cilium agent is down (due to a crash or an upgrade), it can miss node events. Upon startup, live nodes are upserted, but when deletions are missed, the agent fails to clean up node-related system state. Examples of such state include bpf map entries, xfrm states or routes. In particular, the agent fails to clean up node IP to nodeID mappings in the nodeid bpf map. Since K8s will happily recycle such IPs, this can lead to breakage, as the agent associates the wrong nodeID with IPs.

To avoid leaking this state, the node manager now dumps its view of the current set of nodes to a file in the runtime state directory, which can be read on restart of an agent. This is similar to how we restore other state upon restart.

When reading this file, it's important to avoid resurrecting long-gone nodes (as we don't know for how long the agent was down); instead, we merely take note of which nodes we knew of in the past, compare that to the nodes we consider live (once synced to k8s), and delete the ones which seem to have disappeared.

The motivation to build this reconciliation based on full state dumps to disk is that downstream code generally assumes it has access to a full node object in the deletion callbacks. This makes it infeasible to base the pruning on just the information available in bpf maps. In an alternative design, downstream subsystems would be responsible for cleaning up their own state based on just a node identifier, but current code doesn't allow for this.

Signed-off-by: David Bimmler <[email protected]>
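For illustration, a minimal sketch of what the checkpoint half of this could look like (all names here, `Node`, `checkpoint`, `nodes.json`, are hypothetical stand-ins, not the actual Cilium code): the manager serializes its current node set to a JSON file in the state directory whenever its view changes.

```go
package nodestore

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// Node is a stand-in for the manager's node type; the real agent persists
// full node objects so that deletion handlers have all the fields they need.
type Node struct {
	Name string `json:"name"`
	IP   string `json:"ip"`
}

// checkpointFile is a hypothetical file name inside the agent's StateDir.
const checkpointFile = "nodes.json"

// checkpoint atomically writes the manager's current view of the cluster's
// nodes to the runtime state directory.
func checkpoint(stateDir string, nodes []Node) error {
	tmp, err := os.CreateTemp(stateDir, checkpointFile+".tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup if the rename never happens

	if err := json.NewEncoder(tmp).Encode(nodes); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// Rename is atomic on POSIX filesystems, so a restarting agent never
	// observes a partially written checkpoint.
	return os.Rename(tmp.Name(), filepath.Join(stateDir, checkpointFile))
}
```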
Force-pushed d0f2d92 to e1871a7
Member (Author)
/test
Member (Author)
@ldelossa friendly ping for review
borkmann approved these changes Jul 4, 2024
Member (Author)
Marking for backport to 1.16 as it's a bugfix.
When the cilium agent is down (due to a crash or an upgrade), it can miss node events. Upon startup, live nodes are upserted, but when deletions are missed, the agent fails to clean up node-related system state. Examples of such state include bpf map entries, xfrm states or routes. In particular, the agent fails to clean up node IP to nodeID mappings in the nodeid bpf map. Since K8s will happily recycle such IPs, this can lead to breakage, as the agent associates the wrong nodeID with IPs.
To avoid leaking this state, the node manager now dumps its view of the current set of nodes to a file in the runtime state directory, which can be read on restart of an agent. This is similar to how we restore other state upon restart.
When reading this file, it's important to avoid resurrecting long-gone nodes (as we don't know for how long the agent was down) - instead, we merely take note of which nodes we knew of in the past, compare that to the nodes we consider live (once synced to k8s), and delete the ones which seem to have disappeared.
The motivation to build this reconciliation based on full state dumps to disk is that downstream code generally assumes it has access to a full node object in the deletion callbacks. This makes it infeasible to base the pruning on just the information available in bpf maps. In an alternative design, downstream subsystems would be responsible for cleaning up their own state based on just a node identifier, but current code doesn't allow for this.
Fixes: #29822
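To make the pruning step concrete, here is a rough sketch building on the hypothetical `Node` and `checkpointFile` from the checkpoint sketch above (again, not the actual implementation): the previous checkpoint is read back, compared against the live set once the agent has synced with k8s, and a synthetic deletion is emitted for every node that is gone.

```go
package nodestore

import (
	"encoding/json"
	"os"
	"path/filepath"
)

// restore reads the checkpoint written by a previous agent instance.
// A missing file simply means there is nothing to prune.
func restore(stateDir string) ([]Node, error) {
	f, err := os.Open(filepath.Join(stateDir, checkpointFile))
	if os.IsNotExist(err) {
		return nil, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()

	var nodes []Node
	if err := json.NewDecoder(f).Decode(&nodes); err != nil {
		return nil, err
	}
	return nodes, nil
}

// pruneStale calls onDelete for every node that was present in the previous
// checkpoint but is absent from the live set observed after syncing with k8s.
// The full node object from the checkpoint is handed to the callback, since
// downstream deletion handlers expect more than a bare identifier.
func pruneStale(previous, live []Node, onDelete func(Node)) {
	liveNames := make(map[string]struct{}, len(live))
	for _, n := range live {
		liveNames[n.Name] = struct{}{}
	}
	for _, n := range previous {
		if _, ok := liveNames[n.Name]; !ok {
			onDelete(n) // synthesize the missed node deletion event
		}
	}
}
```

Note that in this sketch the checkpoint is only ever used to notice nodes that disappeared while the agent was down; it is never used to resurrect a node that is absent from the live set, matching the intent described above.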