btf: lazy decoding of string table by lmb · Pull Request #1772 · cilium/ebpf

lmb · 2025-05-01T14:38:03Z

Most of the time spent parsing vmlinux BTF goes into constructing the essentialName -> TypeID index:

- We need to allocate a lot of strings
- We need to do a lot of hash table lookups

Replace the hash table with a "fuzzy" index, which doesn't require allocating strings. The trade-off is that lookups now become more expensive, but that is a fine trade-off to make.

core: 1
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 13th Gen Intel(R) Core(TM) i7-1365U
                │  base.txt   │          lazy-strings.txt           │
                │   sec/op    │    sec/op     vs base               │
ParseVmlinux      27.24m ± 1%   15.49m ±  1%  -43.15% (p=0.002 n=6)
IterateVmlinux    149.9m ± 2%   132.3m ± 15%  -11.73% (p=0.041 n=6)
InspectorGadget   35.08m ± 2%   21.28m ±  5%  -39.34% (p=0.002 n=6)
geomean           52.32m        35.20m        -32.73%

                │   base.txt    │          lazy-strings.txt           │
                │     B/op      │     B/op      vs base               │
ParseVmlinux       9.969Mi ± 0%   4.960Mi ± 0%  -50.25% (p=0.002 n=6)
IterateVmlinux     34.72Mi ± 0%   31.92Mi ± 0%   -8.07% (p=0.002 n=6)
InspectorGadget   11.994Mi ± 0%   7.169Mi ± 0%  -40.23% (p=0.002 n=6)
geomean            16.07Mi        10.43Mi       -35.10%

                │   base.txt    │          lazy-strings.txt          │
                │   allocs/op   │  allocs/op   vs base               │
ParseVmlinux      146058.0 ± 0%    162.0 ± 0%  -99.89% (p=0.002 n=6)
IterateVmlinux      272.9k ± 0%   291.5k ± 0%   +6.83% (p=0.002 n=6)
InspectorGadget    155.03k ± 0%   24.30k ± 0%  -84.32% (p=0.002 n=6)
geomean             183.5k        10.47k       -94.29%

Signed-off-by: Lorenz Bauer <[email protected]>

The most common use case of a Spec is to look up a type by its name. For this purpose we maintain a map[essentialName][]TypeID. This requires allocating a string for each named type, which causes a very large overhead when parsing BTF.

In reality, only a very small number of the named types will ever be looked up. The intuition here is that a couple of structs in the kernel contain most of the interesting information, for example struct sk_buff.

Move as much of the cost of looking up a type by name to the actual lookup. Instead of spending a lot of time constructing an index up front, we only maintain an index going from the hash of a name to a type ID.

1. We can compute the hash on a byte slice and therefore avoid allocating a string.
2. Storing the index as a (hash, id) tuple allows us to store it in a slice. Lookups are just a binary search into the index.
3. Hash collisions do not introduce additional complexity because types can already share the same name. At the same time the common case of a 1:1 mapping from name to type is fast.

Signed-off-by: Lorenz Bauer <[email protected]>
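The idea in the commit message can be sketched as follows. This is a minimal illustration, not the code from this PR: the names `fuzzyIndex`, `entry`, `Add`, `Build` and `Candidates` are hypothetical, the 32-bit truncation of the hash is an assumption, and `hash/maphash` is just one convenient way to hash a byte slice without first converting it to a string.

```go
package main

import (
	"bytes"
	"fmt"
	"hash/maphash"
	"sort"
)

// TypeID is a stand-in for the library's type identifier.
type TypeID uint32

// entry pairs the hash of a name with the ID of a type carrying that name.
type entry struct {
	hash uint32
	id   TypeID
}

// fuzzyIndex is a hypothetical index from name hash to type ID.
// It stores no strings, only (hash, id) tuples in a slice.
type fuzzyIndex struct {
	seed    maphash.Seed
	entries []entry
}

func newFuzzyIndex(capacity int) *fuzzyIndex {
	return &fuzzyIndex{seed: maphash.MakeSeed(), entries: make([]entry, 0, capacity)}
}

// hash works directly on the byte slice, so no string is allocated.
func (idx *fuzzyIndex) hash(name []byte) uint32 {
	return uint32(maphash.Bytes(idx.seed, name))
}

// Add records a (hash, id) tuple. Call Build once after the last Add.
func (idx *fuzzyIndex) Add(name []byte, id TypeID) {
	idx.entries = append(idx.entries, entry{idx.hash(name), id})
}

// Build sorts the tuples by hash (then ID) so lookups can binary search.
func (idx *fuzzyIndex) Build() {
	sort.Slice(idx.entries, func(i, j int) bool {
		a, b := idx.entries[i], idx.entries[j]
		if a.hash != b.hash {
			return a.hash < b.hash
		}
		return a.id < b.id
	})
}

// Candidates returns every ID whose name hash matches. The result is
// "fuzzy": it may include hash collisions, so the caller has to compare
// the real names, just as it already must for types sharing a name.
func (idx *fuzzyIndex) Candidates(name []byte) []TypeID {
	h := idx.hash(name)
	i := sort.Search(len(idx.entries), func(i int) bool { return idx.entries[i].hash >= h })
	var ids []TypeID
	for ; i < len(idx.entries) && idx.entries[i].hash == h; i++ {
		ids = append(ids, idx.entries[i].id)
	}
	return ids
}

func main() {
	// names plays the role of the string table: ID -> raw name bytes.
	names := map[TypeID][]byte{
		2: []byte("sk_buff"),
		3: []byte("task_struct"),
		7: []byte("sk_buff"),
	}
	idx := newFuzzyIndex(len(names))
	for id, name := range names {
		idx.Add(name, id)
	}
	idx.Build()

	want := []byte("sk_buff")
	for _, id := range idx.Candidates(want) {
		// Filter out hash collisions by checking the actual name.
		if bytes.Equal(names[id], want) {
			fmt.Println("match:", id)
		}
	}
}
```

The cost moves from parse time to lookup time: building the index is a single sort over fixed-size tuples, while each lookup pays for a binary search plus a name comparison per candidate.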

dylandreimerink

🚀 cool approach with the fuzzyStringIndex. Can't find anything wrong with this.

btf: lazy string table

3da296f

Signed-off-by: Lorenz Bauer <[email protected]>

lmb force-pushed the btf-lazy-strings branch from 601e57c to 7bf307e Compare May 1, 2025 14:41

lmb force-pushed the btf-lazy-strings branch from 7bf307e to d140dfc Compare May 1, 2025 14:58

lmb marked this pull request as ready for review May 1, 2025 15:01

lmb requested a review from dylandreimerink as a code owner May 1, 2025 15:01

dylandreimerink approved these changes May 2, 2025

lmb merged commit 1e9e58e into cilium:main May 2, 2025
17 checks passed

lmb deleted the btf-lazy-strings branch May 2, 2025 10:21

burak-ok mentioned this pull request May 30, 2025

deps: Update to go-ebpf version with lower memory usage inspektor-gadget/inspektor-gadget#4541

Merged

paulcacheux mentioned this pull request Jun 20, 2025

btf: introduce caching string table to speed up ext info loading #1809

Merged

lmb merged 2 commits into cilium:main from lmb:btf-lazy-strings
