Skip to content

bpf:trace: adjust TRACE_PAYLOAD_LEN for overlay packets#38984

Merged
kaworu merged 5 commits intomainfrom
pr/smagnani96/adjust-trace-payload
Jun 18, 2025
Merged

bpf:trace: adjust TRACE_PAYLOAD_LEN for overlay packets#38984
kaworu merged 5 commits intomainfrom
pr/smagnani96/adjust-trace-payload

Conversation

@smagnani96
Copy link
Copy Markdown
Contributor

@smagnani96 smagnani96 commented Apr 16, 2025

Blocked by #39650. Good to have (non-blocking) for #37634.

As described in commit messages, when tracing overlay packets the default TRACE_PAYLOAD_LEN might not be enough. An example would be an Overlay IPv4 packet carrying an IPv6 TCP packet with Options (MSS, etc., such as a SYN packet). In that case, Hubble would fail decoding such packet due to missing bytes. Same problems would arise in case of IPv6 underlay.

For this reason:

  1. move TRACE_PAYLOAD_LEN to CONFIG(trace_payload_len) for coherency with the new datapath config, and use it to trace native packets
  2. introduce CONFIG(trace_payload_len_overlay) and use it for tracing overlay packets
  3. postpone the computation of capture_length when tracing to only when the trace needs to be omitted
  4. add an helper to accomplish (3) based on the current classifier flags: use (2) when CLS_FLAG_{VXLAN,GENEVE} is set, (1) otherwise.

Fixes: #39030.

@smagnani96 smagnani96 added kind/enhancement This would improve or streamline existing functionality. release-note/misc This PR makes changes that have no direct user impact. labels Apr 16, 2025
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 516cdd4 to 6a583b3 Compare April 17, 2025 09:04
@julianwiedmann
Copy link
Copy Markdown
Member

Outer: 14 (ethernet) + 40 (ipv6) + 8 (udp) + 8 (geneve) + 24 (geneve v6 opts)

You might want to consider one level of VLAN header in this calculation, to be on the safe side.

@smagnani96 smagnani96 changed the base branch from main to pr/smagnani96/tracing-struct-refactoring April 17, 2025 13:28
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch 2 times, most recently from c2de6db to 198645b Compare April 17, 2025 14:09
@smagnani96 smagnani96 changed the base branch from pr/smagnani96/tracing-struct-refactoring to pr/smagnani96/monitor-decode-overlay April 18, 2025 18:38
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 198645b to 41c36ed Compare April 18, 2025 18:38
@smagnani96 smagnani96 had a problem deploying to release-base-images April 18, 2025 18:38 — with GitHub Actions Failure
@smagnani96 smagnani96 force-pushed the pr/smagnani96/monitor-decode-overlay branch from 7e24d18 to 1a3e906 Compare April 18, 2025 18:39
@smagnani96 smagnani96 changed the title bpf:trace: adjust TRACE_PAYLOAD_LEN for IPv6-only Geneve DSR bpf:trace: adjust TRACE_PAYLOAD_LEN for overlay packets Apr 18, 2025
@smagnani96 smagnani96 linked an issue Apr 18, 2025 that may be closed by this pull request
@smagnani96 smagnani96 force-pushed the pr/smagnani96/monitor-decode-overlay branch from 1a3e906 to 2a0de51 Compare April 23, 2025 10:05
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch 2 times, most recently from ea3a8a6 to 1f3ddaa Compare April 23, 2025 15:13
@smagnani96 smagnani96 force-pushed the pr/smagnani96/monitor-decode-overlay branch from 2a0de51 to 58cbe6a Compare April 28, 2025 11:02
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 1f3ddaa to 7d74669 Compare April 28, 2025 13:11
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 7d74669 to 10f4b10 Compare May 7, 2025 13:48
@smagnani96 smagnani96 changed the base branch from pr/smagnani96/monitor-decode-overlay to pr/smagnani96/bpf-overlay-classifiers June 4, 2025 16:12
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 10f4b10 to 12b48a2 Compare June 4, 2025 22:38
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 12b48a2 to e3a60f0 Compare June 4, 2025 22:56
@smagnani96 smagnani96 added the dont-merge/blocked Another PR must be merged before this one. label Jun 5, 2025
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from e3a60f0 to bb12e4e Compare June 5, 2025 07:23
@kaworu kaworu removed the dont-merge/blocked Another PR must be merged before this one. label Jun 17, 2025
@kaworu kaworu added the dont-merge/blocked Another PR must be merged before this one. label Jun 17, 2025
Base automatically changed from pr/smagnani96/bpf-overlay-classifiers to main June 17, 2025 09:38
@kaworu kaworu added dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. and removed dont-merge/blocked Another PR must be merged before this one. labels Jun 17, 2025
With this commit, we move TRACE_PAYLOAD_LEN from being a pure macro to being
a config variable. This is purely to adapt to new BPF config and subsequent
changes to introduce a new related config variable for tracing overlay packets.

Signed-off-by: Simone Magnani <[email protected]>
This commit introduces a new node config variable trace_payload_len_overlay.
This variable will be enabled in subsequent commit, so that we can push
more bytes during trace notification for overlay traffic. In fact, the
old TRACE_PAYLOAD_LEN (128B) might not be enough for Hubble to decode
overlay packets such as:

- TCPv6 with options over VXLANv4 (>=134B) -- decode error
- TCPv6 with SRv6 segments (>=136B) -- decode error
- {ICMP,UDP,TCP}v6 over (future) VXLANv6 (>=132B) -- decode error

Given that now the datapath can distinguish between native and overlay
traffic, let's make use of it.

Signed-off-by: Simone Magnani <[email protected]>
With this patch we simply refactors the checks to enable Overlay
classifiers during classification into specific helper functions for
code clarity and reusability (see subsequent patch).

Signed-off-by: Simone Magnani <[email protected]>
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from bb12e4e to 7713995 Compare June 17, 2025 12:59
@smagnani96 smagnani96 removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Jun 17, 2025
This commits adds an helper function `compute_capture_length` to compute the
length of the capture associated to the trace/drop notification event.
While doing this, we now postpone the computation of such value in trace/drop
notification to save computational time and verifier complexity, given
the value would not be used in case the notification should not be emitted.
With the next commit, we further extend this helper to account for
Overlay packets that require additional bytes.

Signed-off-by: Simone Magnani <[email protected]>
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 7713995 to 34ba9bc Compare June 17, 2025 13:02
This commit make use of the previously introduced trace_payload_len_overlay.
Whenever the dataplane is able to classify a packet as OVERLAY, let's
push additional bytes in the notification event to allow Hubble decoding
the packet. An example of failure before this patch is the `TCP data offset
greater than packet length` error. This error can happen with an TCPv6 SYN
(options MSS) over VXLAN: we're pushing 128B, but 144B would be needed for
correctly decoding the TCP options.

The utility function `ctx_capture_length` returns the correct capture
length for the notification event. It makes use of classifiers flags
and observation point. When provided with the monitor value (trace
notifications, as in drop notifications we use 0), the helper accounts for it:

* if `monitor` is 0, then the utility selects the default value based on
  the provided flags, either trace_payload_len or trace_payload_len_overlay.
* if monitor is the default trace_payload_len used in the codebase, then
  the helper still checks the flags and transparently adjusts to
  trace_payload_len_overlay with overlay traffic.

Normal usages should still prefer trace_payload_len in the codebase: the
dapath will transparently adjust it to trace_payload_len_overlay during
the trace notification event if/when needed.

Signed-off-by: Simone Magnani <[email protected]>
@smagnani96 smagnani96 force-pushed the pr/smagnani96/adjust-trace-payload branch from 34ba9bc to eafc457 Compare June 17, 2025 13:04
@smagnani96
Copy link
Copy Markdown
Contributor Author

  • Rebased
  • Refactored ctx_capture_len -> compute_capture_len
  • Refactored supports_overlay_mark -> can_observe_overlay_mark
  • Refactored supports_overlay_header -> can_observe_overlay_hdr
  • Added additional 2 BPF tests for trace_payload_len / trace_payload_len_overlay

No functional changes wrt previous reviews.

@smagnani96
Copy link
Copy Markdown
Contributor Author

/test

@kaworu kaworu added this pull request to the merge queue Jun 18, 2025
Merged via the queue into main with commit 3b9601f Jun 18, 2025
359 of 362 checks passed
@kaworu kaworu deleted the pr/smagnani96/adjust-trace-payload branch June 18, 2025 13:56
@maintainer-s-little-helper maintainer-s-little-helper bot added ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Jun 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/enhancement This would improve or streamline existing functionality. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

monitor:hubble: incorrectly decoded TCPv6-over-vxlan

7 participants