This is an implementation of a base64 stream encoding/decoding library in C99
-with SIMD (AVX2, NEON, AArch64/NEON, SSSE3, SSE4.1, SSE4.2, AVX) and
+with SIMD (AVX2, AVX512, NEON, AArch64/NEON, SSSE3, SSE4.1, SSE4.2, AVX) and
[OpenMP](http://www.openmp.org) acceleration. It also contains wrapper functions
to encode/decode simple length-delimited strings. This library aims to be:

@@ -19,6 +19,10 @@ will pick an optimized codec that lets it encode/decode 12 or 24 bytes at a
time, which gives a speedup of four or more times compared to the "plain"
bytewise codec.

+AVX512 support is currently limited to encoding, using the AVX512 VL and VBMI
+instructions. Decoding reuses the AVX2 implementation. CPUs from Cannonlake
+(manufactured in 2018) onward support these instructions.
+
NEON support is hardcoded to on or off at compile time, because portable
runtime feature detection is unavailable on ARM.

@@ -59,6 +63,9 @@ optimizations described by Wojciech Muła in a
The OpenMP implementation was added by Ferry Toth (@htot) from [Exalon Delft](http://www.exalondelft.nl).

## Building
@@ -76,8 +83,8 @@ To compile just the "plain" library without SIMD codecs, type:
make lib/libbase64.o
```

-Optional SIMD codecs can be included by specifying the `AVX2_CFLAGS`, `NEON32_CFLAGS`, `NEON64_CFLAGS`,
A typical build invocation on x86 looks like this:

```sh
@@ -93,6 +100,15 @@ Example:
AVX2_CFLAGS=-mavx2 make
```

+### AVX512
+
+To build and include the AVX512 codec, set the `AVX512_CFLAGS` environment variable to a value that will turn on AVX512 support in your compiler, typically `-mavx512vl -mavx512vbmi`.
+Example:
+
+```sh
+AVX512_CFLAGS="-mavx512vl -mavx512vbmi" make
+```
+
The codec will only be used if runtime feature detection shows that the target machine supports AVX2.

### SSSE3
@@ -208,6 +224,7 @@ Mainly there for testing purposes, this is also useful on ARM where the only way
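The README text in this diff says a SIMD codec is only used when runtime feature detection shows the machine supports it, and that the AVX512 encoder needs the VL and VBMI instructions. As a rough illustration only, and not the library's actual dispatch code, a check for those two feature bits could be sketched as below. It assumes a GCC or Clang compiler targeting x86 and uses the compiler builtin `__builtin_cpu_supports`; the helper name `have_avx512_vl_vbmi` is hypothetical.

```c
/* Hypothetical helper, not part of the library's API.
 * Returns 1 when the CPU reports both AVX512 VL and AVX512 VBMI,
 * and 0 otherwise (including on non-x86 targets or other compilers). */
int have_avx512_vl_vbmi(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	/* Initialise the compiler's CPU feature cache before querying it. */
	__builtin_cpu_init();

	/* Both features must be present; && normalises the result to 0 or 1. */
	return __builtin_cpu_supports("avx512vl") &&
	       __builtin_cpu_supports("avx512vbmi");
#else
	/* Assumption: without these builtins, report the features as absent. */
	return 0;
#endif
}
```

Calling such a helper once at startup and caching the result would avoid repeating the CPU query on every encode call.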