[PoC] Introduce new flag `SpecCacheDisabled` & Parse only the requires BTF types by burak-ok · Pull Request #1755 · cilium/ebpf

burak-ok · 2025-04-16T09:12:12Z

Based on #1589

This is a proof-of-concept. The code is not ready for merging but it shows it is possible to significantly reduce the memory consumption (by 27MB).

This PR aims to lower the memory footprint when using the cilium/ebpf library. This is achieved in two ways:

1. Adding a new flag which disables caching of the BTF spec
2. Parsing only the needed BTF types

The tradeoff for the lower memory footprint is, of course, performance while loading eBPF programs.

In this PoC, using the new SpecCacheDisabled flag (1) also automatically checks which BTF types are needed and loads/parses only those (2).

Benchmarks:

ParseVmlinux is always the base for both tests
The new.txt InspektorGadget run uses the types from [PoC] Introduce new flag SpecCacheDisabled & Parse only the requires BTF types #1755
The new.txt ParseVmlinux run is called BenchmarkParseVmlinuxWithoutFilter in this PR

goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 11th Gen Intel(R) Core(TM) i7-11370H @ 3.30GHz
                     │   old.txt   │               new.txt                │
                     │   sec/op    │   sec/op     vs base                 │
ParseVmlinux-8         29.28m ± 3%   60.62m ± 8%  +107.06% (p=0.000 n=10)
InspektorGadget-8      29.28m ± 3%   12.94m ± 3%   -55.79% (p=0.000 n=10)
geomean                29.28m        28.01m         -4.32%

                     │   old.txt    │                new.txt                │
                     │     B/op     │     B/op      vs base                 │
ParseVmlinux-8         24.43Mi ± 0%   53.58Mi ± 0%  +119.37% (p=0.000 n=10)
InspektorGadget-8      24.43Mi ± 0%   10.89Mi ± 0%   -55.40% (p=0.000 n=10)
geomean                24.43Mi        24.16Mi         -1.08%

                     │   old.txt   │               new.txt               │
                     │  allocs/op  │  allocs/op   vs base                │
ParseVmlinux-8         271.9k ± 0%   365.1k ± 0%  +34.30% (p=0.000 n=10)
InspektorGadget-8      271.9k ± 0%   110.1k ± 0%  -59.51% (p=0.000 n=10)
geomean                271.9k        200.5k       -26.26%
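
As a quick sanity check on how to read the table, the following sketch recomputes the sec/op deltas and the geomean row from the medians shown above. The helper names `pctDelta` and `geomean` are made up for this illustration and are not part of benchstat. Small differences from the printed percentages are expected, because the table shows rounded medians while benchstat works on the raw samples.

```go
package main

import (
	"fmt"
	"math"
)

// pctDelta returns the relative change from base to next, in percent.
func pctDelta(base, next float64) float64 {
	return (next - base) / base * 100
}

// geomean returns the geometric mean of xs; all values must be positive.
func geomean(xs ...float64) float64 {
	sumLog := 0.0
	for _, x := range xs {
		sumLog += math.Log(x)
	}
	return math.Exp(sumLog / float64(len(xs)))
}

func main() {
	// sec/op medians in milliseconds, copied from the table above.
	const base, parseAll, gadget = 29.28, 60.62, 12.94

	fmt.Printf("ParseVmlinux:    %+.1f%%\n", pctDelta(base, parseAll))
	fmt.Printf("InspektorGadget: %+.1f%%\n", pctDelta(base, gadget))
	fmt.Printf("geomean new.txt: %.2fm\n", geomean(parseAll, gadget))
}
```

Note that the geomean row averages a regression (parsing everything) with an improvement (parsing only the gadget types), so it says little about either workload on its own.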

This flag disables the caching of the BTF which reduces the memory footprint. Furthermore this also only parses the needed symbols out of the BTF instead of reading and interpreting everything.

Signed-off-by: Burak Ok <[email protected]>
Co-authored-by: Alban Crequy <[email protected]>

lmb · 2025-04-16T15:19:50Z

Seems like vmlinux spec keeps being a problem! Just checking: when you say memory usage you mean heap at idle? Calling https://pkg.go.dev/github.com/cilium/ebpf/btf#FlushKernelSpec does not help?

How would you determine which types to parse from vmlinux?

burak-ok · 2025-04-17T11:18:17Z

> Seems like vmlinux spec keeps being a problem! Just checking: when you say memory usage you mean heap at idle? Calling https://pkg.go.dev/github.com/cilium/ebpf/btf#FlushKernelSpec does not help?

Yes, that would help lower heap usage after the program has started. But if one sets a low memory limit in a pod spec, one also needs to avoid high memory usage during initialization, i.e. while loading every program.

Furthermore, with FlushKernelSpec we need to find and identify every possible call into the ebpf library which might parse and cache the whole kernel spec. With a flag, we can set it once and forget about flushing the kernel spec.
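
To make the difference between flushing and a flag concrete, here is a toy model of a spec cache. It is only a sketch: `spec`, `specCache`, `load`, and `flush` are hypothetical names invented for this example and do not mirror the library's internals. It models retention only (what stays resident after loading), not the peak during parsing.

```go
package main

import (
	"fmt"
	"sync"
)

// spec stands in for a fully parsed kernel BTF spec (hypothetical type).
type spec struct{ types []string }

// specCache models a process-wide cache of the parsed spec.
type specCache struct {
	mu       sync.Mutex
	disabled bool  // models the proposed SpecCacheDisabled flag
	cached   *spec // retained spec, nil if nothing is cached
	parses   int   // how many times the expensive parse ran
}

// load returns the cached spec if present, otherwise parses a new one.
// With the cache disabled, the parsed spec is never retained.
func (c *specCache) load() *spec {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.cached != nil {
		return c.cached
	}
	c.parses++
	s := &spec{types: []string{"task_struct", "nsproxy"}}
	if !c.disabled {
		c.cached = s
	}
	return s
}

// flush drops the cached spec, as a manual flush call would.
func (c *specCache) flush() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.cached = nil
}

func main() {
	cached := &specCache{}
	cached.load()
	cached.load()
	fmt.Println("cache on:  parses =", cached.parses, "resident =", cached.cached != nil)
	cached.flush() // must be repeated after every code path that may load the spec
	fmt.Println("after flush: resident =", cached.cached != nil)

	uncached := &specCache{disabled: true}
	uncached.load()
	uncached.load()
	fmt.Println("cache off: parses =", uncached.parses, "resident =", uncached.cached != nil)
}
```

In this model the flag trades repeated parsing for never keeping the spec resident, whereas the flush approach only works if every caller remembers to flush.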

> How would you determine which types to parse from vmlinux?

For that, we read all relo.TypeNames out of the program at the point where the relocations are applied: https://github.com/cilium/ebpf/pull/1755/files#diff-981ef293a9c93614e843135eb5b207f951babe5604aee33c72f66567ccaa01de
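
The idea of deriving the needed types from relocations can be sketched roughly as follows. The `relocation` struct and `neededTypeNames` helper are hypothetical simplifications for illustration; the actual PoC works on the library's own relocation data, which is not reproduced here.

```go
package main

import (
	"fmt"
	"sort"
)

// relocation is a hypothetical, simplified stand-in for a CO-RE
// relocation record; only the target type name matters here.
type relocation struct {
	TypeName string
}

// neededTypeNames returns the sorted, de-duplicated set of type names
// referenced by relos. Relocations without a type name are skipped.
func neededTypeNames(relos []relocation) []string {
	seen := make(map[string]struct{})
	for _, r := range relos {
		if r.TypeName == "" {
			continue
		}
		seen[r.TypeName] = struct{}{}
	}
	names := make([]string, 0, len(seen))
	for n := range seen {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

func main() {
	relos := []relocation{
		{TypeName: "task_struct"},
		{TypeName: "nsproxy"},
		{TypeName: "task_struct"}, // several relocations often hit the same type
		{TypeName: "mnt_namespace"},
	}
	fmt.Println(neededTypeNames(relos))
}
```

The resulting set would then act as a filter while parsing vmlinux BTF, so that only these types (and whatever they reference) need to be decoded.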

ti-mo · 2025-04-17T12:04:01Z

@lmb To me it sounds like lazy-decoding could've been a better avenue to explore after all?

Not sure how (in)feasible it is today, but iirc we had btf.Spec.Add() back in the day which was a blocker. Now we have btf.Builder, we could technically, hypothetically, make btf.Spec a querying layer over an encoded btf blob and only inflate what's queried, and cache the results to enable type comparisons. Or, implement comparers on all types if we don't want to cache anything. Seems like some (most?) users care more about keeping both resident and peak memory usage low rather than speed.
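
A minimal sketch of what such a querying layer could look like, under heavy simplification: the "blob" is a slice of `name:size` records, the sizes are made-up values, and `lazySpec`, `typ`, and `typeByName` are hypothetical names. It only illustrates the inflate-on-query and cache-for-identity idea, not real BTF decoding.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// typ is a hypothetical decoded type.
type typ struct {
	Name string
	Size int
}

// lazySpec is a toy querying layer over an encoded blob.
type lazySpec struct {
	encoded []string        // stands in for the raw BTF blob
	index   map[string]int  // name -> record position, built up front
	cache   map[string]*typ // types inflated so far
	decodes int             // number of records actually decoded
}

// newLazySpec indexes record names without decoding the records.
func newLazySpec(encoded []string) *lazySpec {
	s := &lazySpec{
		encoded: encoded,
		index:   make(map[string]int, len(encoded)),
		cache:   make(map[string]*typ),
	}
	for i, rec := range encoded {
		name, _, _ := strings.Cut(rec, ":")
		s.index[name] = i
	}
	return s
}

// typeByName decodes a type on first use and returns the cached pointer
// afterwards, so repeated lookups can be compared by identity.
func (s *lazySpec) typeByName(name string) (*typ, error) {
	if t, ok := s.cache[name]; ok {
		return t, nil
	}
	i, ok := s.index[name]
	if !ok {
		return nil, fmt.Errorf("type %q not found", name)
	}
	_, rawSize, _ := strings.Cut(s.encoded[i], ":")
	size, err := strconv.Atoi(rawSize)
	if err != nil {
		return nil, fmt.Errorf("decode %q: %w", name, err)
	}
	s.decodes++
	t := &typ{Name: name, Size: size}
	s.cache[name] = t
	return t, nil
}

func main() {
	// Sizes are placeholders, not real kernel struct sizes.
	s := newLazySpec([]string{"task_struct:9792", "nsproxy:72", "pid:112"})
	a, _ := s.typeByName("nsproxy")
	b, _ := s.typeByName("nsproxy")
	fmt.Println("same pointer:", a == b, "decodes:", s.decodes)
}
```

The alternative mentioned above, implementing comparers on all types, would drop the `cache` map entirely at the cost of decoding on every query.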

lmb · 2025-04-21T09:52:58Z

My main concern is / was complexity of a lazy decoder. The whole "fixups" concept needs to be redone... I think you are right that peak usage seems more important.

I see two avenues: there is some perf to be gained by not unmarshaling into an interface for rawType, I think. Two is the lazy decode you mentioned. I still have some old proof of concepts lying around, I'll push those somewhere.

lmb · 2025-04-21T15:57:41Z

@burak-ok could you come up with a list of types which you most frequently need from vmlinux? That way we can add a benchmark we can start optimising against. Right now that benchmark is decoding all of vmlinux which isn't useful.

burak-ok · 2025-04-22T10:43:54Z

I hope the following helps:

Here is a list from a single program which gets loaded every time for Inspektor Gadget:

syscall_trace_enter
task_struct
nsproxy
mnt_namespace

A combined list from multiple programs which get loaded every time for Inspektor Gadget:

pt_regs
file
inode
super_block
socket
syscall_trace_enter
task_struct
nsproxy
mnt_namespace
fanotify_event
pid
trace_event_raw_sched_process_exec

And another combined list, from the programs which get loaded every time plus 4 gadgets (trace_tcp, trace_dns, trace_exec, top_file):

pt_regs
file
inode
super_block
socket
syscall_trace_enter
task_struct
nsproxy
mnt_namespace
fanotify_event
pid
trace_event_raw_sched_process_exec
fs_struct
path
mount
qstr
vfsmount
dentry
bpf_func_id
mm_struct
syscall_trace_exit
linux_binprm
sock
net
inet_sock

Add a benchmark which replicates the types used by Inspektor Gadget for a common configuration. See cilium#1755 (comment) Signed-off-by: Lorenz Bauer <[email protected]>

lmb · 2025-04-23T17:12:42Z

@burak-ok can you take a look at #1763? Doesn't address FlushKernelSpec, but I think that might be better done by relying on weak in Go 1.24.
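
For readers unfamiliar with the `weak` package added in Go 1.24, the sketch below shows how a cache could hold only a weak reference, so the garbage collector may reclaim the spec once no caller uses it. `weakCache` and `spec` are hypothetical names for this illustration; this is not how the library implements it, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"runtime"
	"weak"
)

// spec stands in for a parsed kernel BTF spec (hypothetical type).
type spec struct{ types []string }

// weakCache remembers the last parsed spec without keeping it alive.
// Not safe for concurrent use; a real cache would need locking.
type weakCache struct {
	ref    weak.Pointer[spec]
	parses int
}

// load returns the previously parsed spec if it is still reachable
// elsewhere, otherwise parses again and stores a new weak reference.
func (c *weakCache) load() *spec {
	if s := c.ref.Value(); s != nil {
		return s
	}
	c.parses++
	s := &spec{types: []string{"task_struct"}}
	c.ref = weak.Make(s)
	return s
}

func main() {
	c := &weakCache{}

	s1 := c.load()
	s2 := c.load() // s1 is still referenced, so the weak pointer resolves
	fmt.Println("shared while in use:", s1 == s2, "parses:", c.parses)
	runtime.KeepAlive(s1)

	// Once no strong references remain, a GC cycle may clear the weak
	// pointer, and the next load parses again. Whether that happens at
	// this exact point depends on the runtime, so no output is assumed.
	runtime.GC()
	c.load()
	fmt.Println("parses after GC:", c.parses)
}
```

Compared with an explicit flush, callers would not need to know when to release the spec: it stays shared while any program load is using it and becomes collectable afterwards.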

burak-ok · 2025-04-25T10:23:59Z

> @burak-ok can you take a look at #1763?

I added benchmarks for this PR to the top post, roughly matching the ones you posted.
Comparing these results, your PR saves roughly the same amount of memory and allocations for the InspektorGadget benchmark.
But the ParseVmlinux benchmark shows that my draft PR has more disadvantages for the general use case of reading and parsing every type.

I'll take a deeper look into your PR some time later, thanks for opening it.

> Doesn't address FlushKernelSpec, but I think that might be better done by relying on weak in Go 1.24.

Given that we run with memory limits, I think this would be the best-case scenario for us.

Add a benchmark which replicates the types used by Inspektor Gadget for a common configuration. Also add a benchmark which explicitly iterates all types in vmlinux, which is similar to what pwru does. See cilium#1755 (comment) Signed-off-by: Lorenz Bauer <[email protected]>


lmb · 2025-05-07T10:26:10Z

The new lazy BTF code is in. @burak-ok could you try the code and report back how much of a difference it makes?

This version introduces a lot of improvements. It's worth mentioning cilium/ebpf#1755, which reduces memory usage by ~25MB. It also bumps the Go version to 1.23.

lmb mentioned this pull request Apr 23, 2025

btf: lazy decoding #1763

Merged

lmb closed this May 7, 2025

burak-ok wants to merge 1 commit into cilium:main from inspektor-gadget:burak/btf_filter
