bpf: Add 'host not ready' drop reason if host policy cannot be enforced#29482
Merged
nathanjsweet
approved these changes
Jan 5, 2024
michi-covalent
pushed a commit
that referenced
this pull request
Jul 16, 2024
This excludes the drop reason introduced in #29482. It occurs when Cilium is first installed on a node, the host firewall is enabled, a workload endpoint gets created before the host endpoint, and the workload endpoint in question tries to talk to the host. Preventing these drops would require redesigning parts of the datapath, particularly the clustermesh bootstrap procedure. This is not feasible at the moment, and maybe it's not the right thing to do. Signed-off-by: Timo Beckers <[email protected]>
michi-covalent
pushed a commit
that referenced
this pull request
Aug 5, 2024
[ cherry-picked from cilium/cilium-cli repository ] This excludes the drop reason introduced in #29482. It occurs when Cilium is first installed on a node, the host firewall is enabled, a workload endpoint gets created before the host endpoint, and the workload endpoint in question tries to talk to the host. Preventing these drops would require redesigning parts of the datapath, particularly the clustermesh bootstrap procedure. This is not feasible at the moment, and maybe it's not the right thing to do. Signed-off-by: Timo Beckers <[email protected]>
github-merge-queue bot
pushed a commit
that referenced
this pull request
Aug 16, 2024
[ cherry-picked from cilium/cilium-cli repository ] This excludes the drop reason introduced in #29482. It occurs when Cilium is first installed on a node, the host firewall is enabled, a workload endpoint gets created before the host endpoint, and the workload endpoint in question tries to talk to the host. Preventing these drops would require redesigning parts of the datapath, particularly the clustermesh bootstrap procedure. This is not feasible at the moment, and maybe it's not the right thing to do. Signed-off-by: Timo Beckers <[email protected]>
Repurpose one of the drop reasons left unused since 61fb508
("bpf: Rename unused drop defines to DROP_UNUSED*") to signal that the host
endpoint's policy program was tail-called before it was loaded.
bpf_lxc.c contains multiple tail calls into POLICY_CALL_MAP at the HOST_EP_ID
slot. The program in this slot is provided by bpf_host.c. During first agent
startup, there are often multiple Pods pending creation due to no CNI being
available. As soon as the agent's local API becomes available, these outstanding
CNI requests have a chance to be accepted by the API handler.
If one such request is serviced before the host datapath controller has a chance
to grab the compilation lock, the endpoint program will compile and attach first.
If the host firewall is enabled, and this new workload endpoint sends a packet
to the host, the host's ingress policy needs to be enforced. However, because
bpf_host hasn't been loaded yet, this policy program is not yet present in
POLICY_CALL_MAP, resulting in a missed tail call.
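The race described above can be modelled in plain userspace C. This is only a rough sketch, not the actual datapath code: the array stands in for POLICY_CALL_MAP, the slot index and verdict values are placeholders, and the helper names (load_host_datapath, send_to_host) are hypothetical.

```c
#include <stddef.h>

/* All values below are illustrative placeholders, not Cilium's real ones. */
#define HOST_EP_ID               0
#define POLICY_MAP_SLOTS         4
#define VERDICT_OK               0
#define VERDICT_MISSED_TAIL_CALL (-1)

typedef int (*policy_prog)(void);

/* Stand-in for POLICY_CALL_MAP: a NULL slot means no program is loaded. */
static policy_prog policy_call_map[POLICY_MAP_SLOTS];

/* Stand-in for the policy program that bpf_host.c would provide. */
static int host_policy(void)
{
	return VERDICT_OK;
}

/* Hypothetical helper: models the host datapath finishing its load. */
static void load_host_datapath(void)
{
	policy_call_map[HOST_EP_ID] = host_policy;
}

/* Hypothetical helper: models a workload endpoint sending a packet to the
 * host. A real BPF tail call never returns to the caller on success; on
 * failure, execution falls through to the statement after the call. */
static int send_to_host(void)
{
	policy_prog prog = policy_call_map[HOST_EP_ID];

	if (prog)
		return prog();

	return VERDICT_MISSED_TAIL_CALL;
}
```

Calling send_to_host() before load_host_datapath() yields the missed tail call verdict in this model; calling it afterwards reaches the host policy.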
One potential solution to this problem would be making sure the host datapath
always attaches before workload endpoints do. There's one problem with this
solution: clustermesh requires data from other clusters in order to correctly
populate the local ipcache, and the ipcache currently needs to be populated for
the host endpoint to finish attaching. This data is obtained through
clustermesh-apiserver, typically deployed onto the local node as a regular Pod.
This means workload endpoints must be able to deploy before the host endpoint.
As a stopgap, tolerate these kinds of drops and assign them a specific meaning,
without letting them spill over into the generic 'missed tail call' bucket. To
stabilize end-to-end tests, we're aiming to enforce zero missed tail calls in
all CI scenarios, since they lead to packets that mysteriously go missing,
introducing chaos that's impossible to troubleshoot.
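A minimal userspace sketch of this idea follows. It assumes the new reason is named DROP_HOST_NOT_READY and sits next to a generic DROP_MISSED_TAIL_CALL; the enum values, the HOST_EP_ID value, and both helper functions are hypothetical and only illustrate how a miss at the host slot can be kept out of the generic bucket that CI checks.

```c
#include <stddef.h>

/* Placeholder numbering; the real drop reason values differ. */
enum drop_reason {
	DROP_MISSED_TAIL_CALL,
	DROP_HOST_NOT_READY,	/* assumed name for the repurposed reason */
	DROP_REASON_MAX
};

#define HOST_EP_ID 0	/* placeholder slot index */

/* Hypothetical helper: choose the drop reason after a failed tail call
 * into the policy map. Only a miss at the host slot gets the dedicated
 * reason; every other miss stays in the generic bucket. */
static enum drop_reason classify_missed_policy_call(unsigned int slot)
{
	if (slot == HOST_EP_ID)
		return DROP_HOST_NOT_READY;

	return DROP_MISSED_TAIL_CALL;
}

/* Hypothetical CI-style check: returns 1 when the per-reason counters are
 * acceptable. 'Host not ready' drops are tolerated, generic missed tail
 * calls are not. */
static int drops_acceptable(const unsigned long counts[DROP_REASON_MAX])
{
	return counts[DROP_MISSED_TAIL_CALL] == 0;
}
```

With this split, a test run that only hit the startup race still passes the check, while any other missed tail call fails it.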