btf: use append-style encoding when marshaling #1761
Overall I'm a bit confused by the outcome of this PR. I'd have thought that dropping allocations this much would give a good boost to runtime, but if anything it has a slightly negative effect. I can't figure it out. Maybe one of you has an idea?
My guess is that it's mostly about the type of allocation we are eliminating. [...] Even though the improvement is slight, I do like this a lot more 👍
The Builder currently uses binary.Write with a shared
bytes.Buffer. Replace this with binary.Append and a byte
slice.
This drops the number of allocations by roughly 99%, but runtime
does not improve: the benchmarks below are 1-4% slower. The overall
amount of memory allocated also increases by 6-13%, and I can't
figure out why.
The []byte is threaded through the encoder.deflate* methods because
assigning to a shared buffer in encoder incurs write barrier
overheads.
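A minimal sketch of the two buffer-handling patterns follows. The `sharedEncoder` and `threadedEncoder` types and their methods are hypothetical stand-ins, not the real encoder from this package, and whether the difference is measurable depends on the compiler version and the workload.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sharedEncoder keeps its output in a struct field. Each append stores
// a new slice header, which contains a pointer, back into the struct.
// When the struct lives on the heap, that pointer store is subject to
// the garbage collector's write barrier.
type sharedEncoder struct {
	buf []byte
}

func (e *sharedEncoder) putUint32(v uint32) {
	e.buf = binary.LittleEndian.AppendUint32(e.buf, v)
}

// threadedEncoder holds no buffer. The caller passes the slice in and
// receives the extended slice back, so it can stay in a local variable.
type threadedEncoder struct{}

func (e *threadedEncoder) putUint32(b []byte, v uint32) []byte {
	return binary.LittleEndian.AppendUint32(b, v)
}

func main() {
	shared := &sharedEncoder{}
	threaded := &threadedEncoder{}

	var b []byte
	for i := uint32(0); i < 4; i++ {
		shared.putUint32(i)
		b = threaded.putUint32(b, i)
	}

	// Four 32-bit values each way; prints "16 16".
	fmt.Println(len(shared.buf), len(b))
}
```

The threaded form makes every method signature slightly noisier, which is the price paid for keeping the slice out of the struct.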
core: 1
goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 13th Gen Intel(R) Core(TM) i7-1365U
             │  base.txt   │            append3.txt            │
             │   sec/op    │   sec/op     vs base              │
Marshaler      11.31m ± 1%   11.41m ± 2%  +0.93% (p=0.041 n=6)
BuildVmlinux   140.7m ± 4%   145.6m ± 1%  +3.50% (p=0.002 n=6)
geomean        39.89m        40.77m       +2.21%

             │   base.txt   │            append3.txt             │
             │     B/op     │     B/op      vs base              │
Marshaler      3.286Mi ± 0%   3.469Mi ± 0%   +5.58% (p=0.002 n=6)
BuildVmlinux   37.31Mi ± 0%   42.22Mi ± 0%  +13.16% (p=0.002 n=6)
geomean        11.07Mi        12.10Mi        +9.31%

             │   base.txt    │            append3.txt            │
             │   allocs/op   │  allocs/op   vs base              │
Marshaler       12627.0 ± 0%    208.0 ± 0%  -98.35% (p=0.002 n=6)
BuildVmlinux   165.689k ± 0%   1.508k ± 0%  -99.09% (p=0.002 n=6)
geomean          45.74k         560.1       -98.78%
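As a rough way to reproduce the allocs/op comparison in isolation, per-call allocation counts can be measured with `testing.AllocsPerRun`. This is only a sketch: the `record` type and the `measure` helper are hypothetical, not part of the btf package, and the exact counts depend on the Go version, so no expected output is given.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"testing"
)

// record is a hypothetical fixed-size value used only for measurement.
type record struct {
	NameOff uint32
	Info    uint32
	Size    uint32
}

// measure reports the average number of heap allocations per call for
// binary.Write into a reused bytes.Buffer and for binary.Append into a
// reused byte slice.
func measure(runs int) (writeAllocs, appendAllocs float64) {
	r := &record{NameOff: 1, Info: 2, Size: 3}

	var buf bytes.Buffer
	writeAllocs = testing.AllocsPerRun(runs, func() {
		buf.Reset()
		if err := binary.Write(&buf, binary.LittleEndian, r); err != nil {
			panic(err)
		}
	})

	out := make([]byte, 0, 64)
	appendAllocs = testing.AllocsPerRun(runs, func() {
		var err error
		// Truncate to length zero so the backing array is reused.
		out, err = binary.Append(out[:0], binary.LittleEndian, r)
		if err != nil {
			panic(err)
		}
	})
	return writeAllocs, appendAllocs
}

func main() {
	w, a := measure(1000)
	fmt.Printf("binary.Write: %.0f allocs/op, binary.Append: %.0f allocs/op\n", w, a)
}
```

A count like this says nothing about runtime on its own, which matches the tables above: far fewer allocations, but no corresponding drop in sec/op.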
Signed-off-by: Lorenz Bauer <[email protected]>