Optimized binary_length_from_base64 functions for most kernels #944

Merged
lemire merged 8 commits into master from dlemire/base64-length on Mar 7, 2026

@lemire
Member

@lemire lemire commented Mar 6, 2026

We have a function, binary_length_from_base64, contributed by @anonrig, that has not yet been optimized. In common cases, it computes the exact output size of base64 decoding, which can be convenient. (Note that it deliberately does not validate the content.)
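To make the difference between the "maximal" and the "exact" length concrete, here is a minimal scalar sketch. It is not the simdutf implementation: the names `sketch_is_base64_space`, `sketch_maximal_binary_length`, and `sketch_binary_length` are hypothetical, and the whitespace set is an assumption based on forgiving base64 decoding. The first function looks only at the length and the last two characters, so it runs in constant time. The second scans the input once, skips whitespace, and does not validate anything.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper names for illustration; NOT the simdutf implementation.

// ASCII whitespace that a forgiving base64 decoder skips (assumption).
inline bool sketch_is_base64_space(char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\r' || c == '\f';
}

// Upper bound from the length and the last two characters only: O(1).
inline std::size_t sketch_maximal_binary_length(std::string_view input) {
  std::size_t n = input.size();
  // Drop up to two trailing '=' padding characters.
  if (n > 0 && input[n - 1] == '=') {
    n--;
    if (n > 0 && input[n - 1] == '=') { n--; }
  }
  // 4 characters -> 3 bytes; a tail of 2 or 3 characters -> 1 or 2 bytes.
  std::size_t rem = n % 4;
  return n / 4 * 3 + (rem > 1 ? rem - 1 : 0);
}

// Exact size for well-formed input: one pass over the data, no validation.
inline std::size_t sketch_binary_length(std::string_view input) {
  std::size_t end = input.size();
  std::size_t padding = 0;
  // Trim trailing whitespace and up to two '=' padding characters.
  while (end > 0) {
    char c = input[end - 1];
    if (sketch_is_base64_space(c)) {
      end--;
    } else if (c == '=' && padding < 2) {
      padding++;
      end--;
    } else {
      break;
    }
  }
  // Count the remaining non-whitespace characters.
  std::size_t count = 0;
  for (std::size_t i = 0; i < end; i++) {
    if (!sketch_is_base64_space(input[i])) { count++; }
  }
  std::size_t rem = count % 4;
  return count / 4 * 3 + (rem > 1 ? rem - 1 : 0);
}
```

For line-wrapped input such as `"TWFu\nTWFu\n"`, the constant-time bound counts the newlines and returns 7, while the exact count returns 6, the actual decoded size. The second function has to read every byte, which is why it benefits from per-kernel SIMD versions.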

Create test file:

base64 -w72  README.md > test.base64

x64 Emerald Rapids

./build/benchmarks/base64/benchmark_base64 -L  test.base64
# current system detected as icelake.
# loading files: .
# volume: 186321 bytes
# max length: 186321 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::icelake_maximal_binary_length_from_base64 :  10152.97 GB/s  14.70 %  15.93 GHz   0.00 c/b   0.00 i/b   0.87 i/c
simdutf::icelake_binary_length_from_base64    :  70.87 GB/s  6.57 %   3.19 GHz   0.04 c/b   0.13 i/b   2.82 i/c
simdutf::haswell_maximal_binary_length_from_base64 :  10050.73 GB/s  15.86 %  15.74 GHz   0.00 c/b   0.00 i/b   0.87 i/c
simdutf::haswell_binary_length_from_base64    :  58.07 GB/s  2.83 %   3.33 GHz   0.06 c/b   0.25 i/b   4.39 i/c

x64 Ice Lake

 sudo ./build/benchmarks/base64/benchmark_base64 -L  test.base64
[sudo] password for dlemire:
# current system detected as icelake.
# loading files: .
# volume: 186321 bytes
# max length: 186321 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::icelake_maximal_binary_length_from_base64 :  7488.24 GB/s  24.41 %  12.17 GHz   0.00 c/b   0.00 i/b   0.79 i/c
simdutf::icelake_binary_length_from_base64    :  44.97 GB/s  3.16 %   3.14 GHz   0.07 c/b   0.13 i/b   1.82 i/c
simdutf::haswell_maximal_binary_length_from_base64 :  8768.61 GB/s  6.24 %  14.27 GHz   0.00 c/b   0.00 i/b   0.79 i/c
simdutf::haswell_binary_length_from_base64    :  50.68 GB/s  1.31 %   3.24 GHz   0.06 c/b   0.25 i/b   3.93 i/c
simdutf::westmere_maximal_binary_length_from_base64 :  8770.23 GB/s  6.22 %  14.27 GHz   0.00 c/b   0.00 i/b   0.79 i/c
simdutf::westmere_binary_length_from_base64   :  28.77 GB/s  0.65 %   3.22 GHz   0.11 c/b   0.44 i/b   3.93 i/c

ARM (Apple M4)

 sudo ./build/benchmarks/base64/benchmark_base64 -L  test.base64
Password:
# current system detected as arm64.
# loading files: .
# volume: 183769 bytes
# max length: 183769 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::arm64_maximal_binary_length_from_base64 :  11974.49 GB/s  inf %  187.29 GHz   0.02 c/b   0.08 i/b   4.94 i/c 
simdutf::arm64_binary_length_from_base64      :  73.13 GB/s  7.72 %   5.65 GHz   0.08 c/b   0.41 i/b   5.25 i/c

Loongson:

sudo ./build/benchmarks/base64/benchmark_base64 -L  test.base64
# current system detected as lasx.
# loading files: .
# volume: 186321 bytes
# max length: 186321 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::lasx_maximal_binary_length_from_base64 :  8260.44 GB/s  12.78 %   6.82 GHz   0.00 c/b   0.00 i/b   1.42 i/c
simdutf::lasx_binary_length_from_base64       :  23.97 GB/s  8.73 %   2.51 GHz   0.10 c/b   0.47 i/b   4.49 i/c
simdutf::lsx_maximal_binary_length_from_base64 :  8259.58 GB/s  12.79 %   6.82 GHz   0.00 c/b   0.00 i/b   1.42 i/c
simdutf::lsx_binary_length_from_base64        :  23.92 GB/s  13.39 %   2.51 GHz   0.10 c/b   0.44 i/b   4.19 i/c

The LASX results are disappointing, but they are still much faster than decoding itself, so it should be OK. For comparison, the decoding performance:

simdutf::lasx                                 :   4.88 GB/s  0.90 %   2.50 GHz   0.51 c/b   2.41 i/b   4.69 i/c
simdutf::lsx                                  :   3.09 GB/s  1.17 %   2.50 GHz   0.81 c/b   4.15 i/b   5.12 i/c

This is joint work with @erikcorry (credit in the commits).

@lemire lemire merged commit 9bfb9be into master Mar 7, 2026
108 checks passed
@pauldreik
Collaborator

Incredible speeds!
I ran the benchmark in the same manner as described above on an AMD 7950X3D and got over 100 GB/s:

simdutf/build/release/benchmarks/base64/benchmark_base64 -L  test.base64
# current system detected as icelake.
# loading files: .
# volume: 186321 bytes
# max length: 186321 bytes
# number of inputs: 1
# lengths
# Benchmark only simdutf length functions (maximal and exact)
simdutf::icelake_maximal_binary_length_from_base64 :  552.22 GB/s  28.78 % 
simdutf::icelake_binary_length_from_base64    :  111.13 GB/s  8.73 % 
simdutf::haswell_maximal_binary_length_from_base64 :  552.61 GB/s  7.72 % 
simdutf::haswell_binary_length_from_base64    :  60.02 GB/s  7.39 % 
simdutf::westmere_maximal_binary_length_from_base64 :  551.44 GB/s  8.29 % 
simdutf::westmere_binary_length_from_base64   :  49.72 GB/s  9.56 % 
simdutf::fallback_maximal_binary_length_from_base64 :  552.34 GB/s  80.39 % 
simdutf::fallback_binary_length_from_base64   :  11.06 GB/s  3.28 % 

@lemire
Member Author

lemire commented Mar 9, 2026

@pauldreik Yes, so I think that's a good addition.
