Conversation
Contributor
Author
|
@brb could you try this in pwru and see how bad things are? |
brb
added a commit
to cilium/pwru
that referenced
this pull request
Apr 28, 2025
Signed-off-by: Martynas Pumputis <[email protected]>
Member
|
@lmb Are you interested in perf benchmarks, or correctness? If the latter, then all good - cilium/pwru#548. Unfortunately, we do not have any benchmarking suites. |
def29e7 to
afa4ee7
Compare
Contributor
Author
|
I added a crude benchmark which just iterates all of vmlinux to simulate pwru. Even that has gotten faster! |
3ba613d to
f9c4803
Compare
Add a benchmark which replicates the types used by Inspektor Gadget for a common configuration. Also add a benchmark which explicitly iterates all types in vmlinux, which is similar to what pwru does. See cilium#1755 (comment) Signed-off-by: Lorenz Bauer <[email protected]>
Stop relying on readAndInflateTypes by creating a full BTF blob in the test. Signed-off-by: Lorenz Bauer <[email protected]>
Signed-off-by: Lorenz Bauer <[email protected]>
mutableTypes is going to be removed when switching to lazy BTF parsing. Move the implementation of AnyTypesByName to Spec. Signed-off-by: Lorenz Bauer <[email protected]>
Lazy decoding of BTF means that iterating over a Spec may produce errors. The current iterator interface doesn't allow returning errors. Replace Iterate() with a real iterator which calls into Spec.TypeByID instead of going via an internal fast path. This also allows moving Datasec fixups in the future. Signed-off-by: Lorenz Bauer <[email protected]>
Some Datasec are not correctly populated when parsing BTF from an ELF file (vs. raw vmlinux BTF). To address this we currently do a full pass over all types to find Datasec that need fixing up. Instead, perform the fixup when retrieving a Type from a Spec. Spec loaded from a raw BTF blob are kept unchanged. Signed-off-by: Lorenz Bauer <[email protected]>
Move the implementation of readAndInflateTypes into a new file. No other changes intended. Signed-off-by: Lorenz Bauer <[email protected]>
Switch BTF decoding from operating on a reader to operating on a byte slice. This simplifies the code and removes the need for a bunch of intermediate buffers. It's also necessary to be able to lazily decode BTF in the future. Signed-off-by: Lorenz Bauer <[email protected]>
Break up readAndInflateTypes into a dedicated decoder type. Constructing the decoder does a single pass over the BTF which collects the offsets of all types in the blob. Signed-off-by: Lorenz Bauer <[email protected]>
Refactor the decoder so that it is possible to unmarshal one type at a time. The main challenges are: * Unmarshaling a type requires unmarshaling all descendants. This is solved by recursively inflating types until reaching a leaf. Care needs to be taken to avoid infinite recursion when dealing with self-referential types like linked list headers. The trick here is that we store a partially inflated Type in d.types so that later function calls return the cached value. The partially inflated Type is removed from d.types on error to avoid polluting the decoder with "broken" types. * Unmarshaling a type may require decoding declTags pointing at that type. This is handled by maintaining a lookaside index which maps type IDs to declTag typeIDs. * Unmarshaling a legacy bitfield also requires a lookaside data structure. Even though the decoder is now capable of lazy decoding we still construct a full []Type when parsing BTF to keep the diff manageable. Signed-off-by: Lorenz Bauer <[email protected]>
Use the lazy decoder to avoid decoding all types upfront. This requires adding an index from essentialName to TypeID to be able to implement AnyTypeByName. decoder has to be safe for concurrent use because we want to share the raw BTF blob and so on when copying a Spec. Signed-off-by: Lorenz Bauer <[email protected]>
It turns out that doing two passes over the encoded BTF is basically free. Use this approach to pre-calculate the correct size for the data structures required by the decoder. This reduces allocations and copying by a substantial amount. Signed-off-by: Lorenz Bauer <[email protected]>
We still construct full slices of various raw BTF types, only to discard them afterwards. Instead, decode only a single raw type and accumulate the inflated equivalents. Signed-off-by: Lorenz Bauer <[email protected]>
Appending to a nil slice allocates a slice with an 8 item capacity as of Go 1.24. This is wasteful since most entries will only ever need a single ID. Do some syntax gymnastics to avoid this waste. Signed-off-by: Lorenz Bauer <[email protected]>
f9c4803 to
ef22608
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Prompted by #1755 another stab at lazy decoding of BTF types. IterateVmlinux is roughly the pwru workload (IIRC) and InspectorGadget does what it says on the tin.