bpf:hubble: update trace/drop notify for L2-less packets#37097
bpf:hubble: update trace/drop notify for L2-less packets#37097joestringer merged 4 commits intocilium:mainfrom
Conversation
e1bb0d2 to
8880c82
Compare
8880c82 to
934c2fd
Compare
934c2fd to
a317f99
Compare
a317f99 to
01f79b3
Compare
|
/test |
jschwinger233
left a comment
There was a problem hiding this comment.
what about the update scenario? If hubble always upgrades before bpf loading, then it's fine.
01f79b3 to
4a2f8c0
Compare
|
/test |
There was a problem hiding this comment.
Thanks @smagnani96 🙏
Some minor comments, the missing correct decoder selection for DropNotify in the Hubble parser, and a suggestion about drop notification versioning.
67deeea to
536dfc0
Compare
kaworu
left a comment
There was a problem hiding this comment.
Thanks @smagnani96 for the quick update, patch LGTM now 🎉
The logic duplication between the Hubble parser and monitor dissect is not ideal but still manageable. Long term we probably want to dedup it.
|
/test |
536dfc0 to
f0e5fe0
Compare
f0e5fe0 to
1ad9f95
Compare
|
/test |
1 similar comment
|
/test |
1ad9f95 to
0395191
Compare
|
/test |
GetConnectionInfo is not referenced nor used in any of our sources. Public API should be `GetConnectionSummary`, which internally calls `getConnectionInfoFromCache`. Signed-off-by: Simone Magnani <[email protected]>
The packet decoder in our Cilium Monitor utility never checks for errors returned from the DecodeLayers function. In Hubble, we do have identical code, but with the additional checks on that returned error code. We should try to keep these tool utilities as aligned as possible to avoid misunderstanding: in case of a decode error, Cilium Monitor would still try to use the decoded info. Since v1.1.18, DecodeLayers returns a non-nil error for an empty packet. Skip decoding and truncate layers to avoid accidental re-use. See google/gopacket#846 Signed-off-by: Simone Magnani <[email protected]>
This patch enabled the correct decoding of trace notifications when emitted from an L3 device (ex. Wireguard). Prior to this patch, our Monitor/Hubble logic always attempted to decode the raw bytes starting from the Ethernet Layer. This of course doesn't hold true for L3 devices, where we never see the L2 layer. Since we already have a flag field in the trace notification with some unused padding, let's borrow a bit to indicate whether the packet is being logged from an L3 device or not. This could have been also achieved from Monitor/Hubble directly, since the notification even has the ifindex, but it would have required to add logic to keep ifindexes related to L3 interfaces up-to-date in Hubble/Monitor. IMHO, (1) this might not scale well, and (2) it is much clearer to have all info forwarded from the datapath. I'm not concerned about performance implication here since no new data is being forwarded in the trace event. Signed-off-by: Simone Magnani <[email protected]>
Similarly as the previous commit, this patch enabled the correct decoding of drop notifications when emitted from an L3 device (ex. Wireguard). Since we don't have a flag field or padding in drop notification struct, let's increment the version and add a `__u32` field to hold flags. For now, we use 1 bit to indicate IPv6 and 1 bit to indicate whether the packet is being dropped from a L3 Device. The other 30 bits are kept as padding the preserve the struct alignment. Signed-off-by: Simone Magnani <[email protected]>
0395191 to
58497fd
Compare
|
/test |
|
|
||
| #if defined(ENABLE_WIREGUARD) && (defined(IS_BPF_HOST) || defined(IS_BPF_WIREGUARD)) | ||
| if (THIS_INTERFACE_IFINDEX == WG_IFINDEX) { | ||
| l3_dev = true; |
There was a problem hiding this comment.
@smagnani96 just driving by - can we make this l3_dev = (ETH_HLEN == 0); instead? That would allow us to capture all types of L3 device, not just the wireguard interface.
Currently we're mistakenly parsing packets sent from an L3 device, as our decoders in Hubble/Monitors assume Ethernet as the first layers of the packet. Therefore, the output looks something like:
Let's fix that by borrowing a bit in the trace/drop notify event to tell whether the packet comes from an L3 device or not:
ctx->protocolto additionally tell whether IPv4/6 (we need these info to know which parser to use)The output after this patch should look as follows:
Fixes: #37095