bpf:classifiers: add Overlay VXLAN/Geneve classifiers#39650
Conversation
cb56786 to
36eb42c
Compare
|
/test |
2f5096c to
11a2aa6
Compare
|
/test |
|
This zip contains charts of all the |
julianwiedmann
left a comment
There was a problem hiding this comment.
Very cool! Right now this only gets used deep down in the trace/drop notification, correct? And the drop path is boring from a complexity point-of-view, because it lives in its own tailcall.
Could you motivate the distinction between VXLAN and GENEVE? Is it so that hubble can easily parse the optional headers of the GENEVE packet?
d8091f8 to
74aa402
Compare
|
/ci-verifier |
74aa402 to
252897a
Compare
|
/ci-verifier |
252897a to
9fa5345
Compare
|
/test |
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
julianwiedmann
left a comment
There was a problem hiding this comment.
👋 had a first look, some comments in-line.
One structural suggestion - I'd split off all the send_trace_notifcy() refactors where you need to touch the whole tree, and push them into their own PR. They are innocent enough (a bit of renaming + macro magic), but have much higher conflict potential. So let's get those in first, and then do a second PR that only focuses on extending the inner mechanics of the classifier.
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
|
I just pushed new changes based on Julian's suggestions (marked them as solved, PTAL for further details). We significantly preserve verifier complexity wrt the previous implementation. This also translates into less performance degradation, given the additional logic in
Testing this PR on top of #37634 seems to provide good output both when:
Let's proceed and then enhance/introduce if we notice we're missing some bits. |
With this commit, we prepare our test suite for classifiers for the new values introduced in subsequent commits. This allows us to write reusable tests for both IPv4/IPv6, while also specifying subtests for each possible category of flags (CLS_FLAG_*). Signed-off-by: Simone Magnani <[email protected]>
In #39794, we enabled passing the carried L3 protocol when calling `send_trace_notify`. Let's use this argument for `ctx_classify`. This allows to set the CLS_FLAG_IPV6 whenever we detect an IPV6 packet, and not only when having a layer 3 IPv6 packet. In addition, this is useful in subsequent commits where we extend classifiers: the new logic will save verifier complexity and packet processing time given we have already the protocol computed. For some edge case in which the variable protocol is left to zero and not being set to the current protocol (ex. from drop notification or when being called after a failing validate_ethertype), we can compute the protocol from `ctx_classify` itself. Here we also initialize the proto variable in bpf_lxc.c to avoid compilation errors due to variable not being initialized. This was not needed before as it was not being used, therefore removed during compilation. Signed-off-by: Simone Magnani <[email protected]>
This commits introduces two new classifiers CLS_FLAG_{VXLAN,GENEVE}.
This are used to signal from datapath that the current packet being
traced/dropped is using one of these two protocols as underlay.
This is not used for computing bpf metrics so far, but will be useful
in a subsequent PR for decoding encap'd layer from Hubble/Monitor.
Having different signals for VXLAN and Geneve traffic is actually useful
so that Hubble/Monitor does not need to check the Cilium configuration
to retrieve which tunnel protocol is in use. This way, the datapath
automatically marks every overlay notification, and Hubble can easily
chose between the next decoding layer to pick. In addition, we can start
sending more/less packet during notification events based on the packet
itself, if it is native or overlay.
With this patch, we introduce two possible ways of determining a packet
is overlay:
1. ctx->mark: available only with SKB, we look whether the mark contains
MARK_MAGIC_OVERLAY. That value is usually set on our egress
path. This has the advantage that is fast, as it does not
require to further inspect packet L2/L3/L4 headers.
2. raw headers: this method inspects the whole packet layers up to L4 UDP.
By looking at protocol and known ports (TUNNEL_PORT in use),
we can determine a packet is overlay.
While (2) can always work in every condition, (1) works only when the mark
is set. This means that in all codepaths where the mark is cleared it can
fail. It would be still backed up by logic in (2), given that it will
always run in case (1) was not informative enough.
Signed-off-by: Simone Magnani <[email protected]>
Simply moving the definition of available `enum trace_point` into a different file to avoid circular dependencies in subsequent commits. Signed-off-by: Simone Magnani <[email protected]>
This patch introduces a new `enum trace_point` TRACE_POINT_UNKNOWN. This is not intended to be used for `send_trace_notify` notification events, as it will be translated into a compile error. The new observation point is for special cases, such as the one enabled in the next commit. Given that drop notification (1) do not account for the observation point and (2) are executed in custom tailcalls, we can use the new TRACE_POINT_UNKNOWN in the next commit where we optimize classifiers based on the trace observation point. The TRACE_POINT_UNKNOWN allows to run all the possible classification logic, both via mark and packet parsing. Signed-off-by: Simone Magnani <[email protected]>
This commit patches `ctx_classify` to account for the current trace
observation point. This allows to preserve both (1) packet processing
performance in all the codepaths where such overlay flags would never
be computed, and (2) preserve verifier complexity with a minimal impact.
The codepaths where we can encounter and trace overlay packets are:
1. to-{wireguard,netdev}: ctx->mark set to MARK_MAGIC_OVERLAY
2. from-{netdev,wireguard}: no mark, need to check UDP + tunnel port.
3. to-stack: with IPSec enabled, we trace TO_STACK events when forwarding
overlay packets to the stack for encryption (VinE).
Signed-off-by: Simone Magnani <[email protected]>
4ba82a9 to
fab75ff
Compare
|
/test |
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
We introduced in cilium#39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
We introduced in #39650 the initial support to signal from the datapath trace/drop notify events related to overlay packets. This patch allows our Monitor tool to decode overlay packets and output additional information regarding the encap'd packet. Prior to these changes: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK ``` After applying the patch: ```bash -> network flow 0x4e41ca37 , identity host->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.2.63:4240 -> 10.244.1.6:41068 tcp ACK [tunnel 172.18.0.4:47375 -> 172.18.0.2:8472 vxlan] <- network flow 0xb55b35e9 , identity remote-node->unknown state unknown ifindex eth0 orig-ip 0.0.0.0: 10.244.1.6:41068 -> 10.244.2.63:4240 tcp ACK [tunnel 172.18.0.2:46960 -> 172.18.0.4:8472 vxlan] ``` Signed-off-by: Simone Magnani <[email protected]>
With this patch, the datapath is now able to signal Overlay packets during Trace/Drop notification events.
This is useful for two main reasons (treated in separate upcoming PRs):
These new flags are solely used for tracing events and they do not interfere with bpf metrics computation.
The VXLAN/GENEVE flags can be computed either:
Please do refer to commit messages for a detailed description and considerations of the introduced changes.
Fixes: #39032