A pure MoonBit compression library supporting DEFLATE, gzip, zlib, LZW, bzip2, Brotli, Zstandard, and LZ4. Targets native (Linux, Windows, macOS), JavaScript, and WebAssembly.
- Pure MoonBit — no FFI required (optional native acceleration for blit/checksum)
- Multi-target: native, js, and wasm-gc backends
- Dynamic Huffman coding with optimal fixed/dynamic block selection
- Level-differentiated compression: fast greedy (1-3), lazy matching (4-9)
- SA-IS suffix array construction for O(n) bzip2 BWT
- Hardware-accelerated CRC-32 (PCLMULQDQ) and Adler-32 (SSSE3) on native, software fallback elsewhere
- Two-level Huffman table decompression with zero-copy direct output
- BytesView-based streaming API — zero-copy input slicing
- Signal protocol streaming — no callbacks, no trait objects, explicit control flow
- Async streaming for DEFLATE via MoonBit's async/io
- Cross-validated against Go's compress/* stdlib where applicable, plus external golden vectors for additional formats
| Package | Description |
|---|---|
| bikallem/compress/flate | DEFLATE compression/decompression (RFC 1951) |
| bikallem/compress/flate/async | Async DEFLATE streaming via @io.Reader/@io.Writer |
| bikallem/compress/gzip | gzip format (RFC 1952) |
| bikallem/compress/zlib | zlib format (RFC 1950) |
| bikallem/compress/lzw | Lempel-Ziv-Welch (GIF/TIFF/PDF) |
| bikallem/compress/brotli | Brotli compression/decompression (RFC 7932) |
| bikallem/compress/bzip2 | bzip2 compression/decompression |
| bikallem/compress/zstd | Zstandard frame compression/decompression with dictionary support (subset encoder) |
| bikallem/compress/lz4 | LZ4 frame compression/decompression |
| bikallem/compress/snappy | Snappy raw and framed compression/decompression |
| bikallem/compress/checksum | CRC-32 and Adler-32 checksums |
```
moon add bikallem/compress
```
Every package provides one-shot compress/decompress functions for simple use cases:
```moonbit
// DEFLATE (level defaults to DefaultCompression)
let compressed = @flate.compress(data)
let compressed = @flate.compress(data, level=BestSpeed)
let decompressed = @flate.decompress(compressed)

// gzip
let compressed = @gzip.compress(data)
let compressed = @gzip.compress(data, level=BestCompression, header={ name: "data.txt", ..Header::default() })
let (decompressed, header) = @gzip.decompress(compressed)

// zlib (supports preset dictionaries)
let compressed = @zlib.compress(data)
let compressed = @zlib.compress(data, dict=my_dict, level=BestSpeed)
let decompressed = @zlib.decompress(compressed)

// LZW
let compressed = @lzw.compress(data, LSB, 8)
let decompressed = @lzw.decompress(compressed, LSB, 8)

// Brotli (level 0-11)
let compressed = @brotli.compress(data)
let compressed = @brotli.compress(data, level=Level(1))
let decompressed = @brotli.decompress(compressed)

// bzip2 (level 1-9, controls block size)
let compressed = @bzip2.compress(data)
let compressed = @bzip2.compress(data, level=9)
let decompressed = @bzip2.decompress(compressed)

// Zstandard
let compressed = @zstd.compress(data)
let compressed = @zstd.compress(data, level=Fast)
let compressed = @zstd.compress(data, dict=my_zstd_dict)
let decompressed = @zstd.decompress(compressed)
let decompressed = @zstd.decompress(compressed, dict=my_zstd_dict)
let decompressed = @zstd.decompress_with_dictionaries(combined_zstd_frames, [dict_a, dict_b])

// LZ4
let compressed = @lz4.compress(data)
let decompressed = @lz4.decompress(compressed)
let custom_lz4_dict = @lz4.Dictionary::new(my_lz4_dict, dict_id=0x12345678U)
let decompressed = @lz4.decompress_with_dictionary(compressed, custom_lz4_dict)
let decompressed = @lz4.decompress_with_dictionaries(concatenated_lz4_frames, [
  @lz4.Dictionary::new(dict_a),
  @lz4.Dictionary::new(dict_b),
])

// Snappy
let compressed = @snappy.compress(data)
let decompressed = @snappy.decompress(compressed)

// Checksums
let crc = @checksum.crc32(data[:])
let adler = @checksum.adler32(data[:])
```

All packages provide Deflater (compressor) and Inflater (decompressor) types with a signal-protocol interface. flate, gzip, zlib, lzw, bzip2, brotli, lz4, and zstd stream incrementally. Raw Snappy decompression is incremental too, but raw Snappy compression can only stream incrementally when the caller knows the final uncompressed size up front, because the format starts with that length varint. Use @snappy.Deflater::new_known_length(data.length()) for raw incremental output, @snappy.Deflater::new() / @snappy.Deflater::new_buffered() for buffered raw compatibility mode, or @snappy.FramedDeflater::new() / @snappy.compress_framed(...) when you need unknown-length streaming output.
Feed data with encode(Some(chunk[:])), finalize with encode(None):
```moonbit
let d = @flate.Deflater::new(level=BestSpeed)
match d.encode(Some(data[:])) {
  Ok => ()         // input buffered, no output yet
  Data(out) => ... // compressed output ready
  End => ...       // shouldn't happen mid-stream
  Error(e) => ...  // compression error
}
loop d.encode(None) {
  Data(out) => { write(out); continue d.encode(None) }
  End => break
  Ok | Error(_) => break
}
```

Feed compressed data with src(chunk[:]), pull output with decode():
```moonbit
let d = @flate.Inflater::new()
d.src(compressed_chunk[:])
loop d.decode() {
  Await => { d.src(next_chunk[:]); continue d.decode() }
  Data(out) => { write(out); continue d.decode() }
  End => break
  Error(e) => ...
}
```

gzip and zlib deflaters/inflaters handle headers, checksums, and trailers automatically:
```moonbit
// gzip with custom header
let d = @gzip.Deflater::new(header={ name: "data.txt", ..Header::default() })
// Access the header after decompression
let header = inflater.header()

// zlib with preset dictionary
let d = @zlib.Deflater::new(dict=my_dict)
let i = @zlib.Inflater::new(dict=my_dict)

// LZW with bit order and literal width
let d = @lzw.Deflater::new(MSB, 8)
let i = @lzw.Inflater::new(MSB, 8)

// Brotli
let d = @brotli.Deflater::new(level=Level(6))
let i = @brotli.Inflater::new()

// bzip2
let d = @bzip2.Deflater::new(level=9)
let i = @bzip2.Inflater::new()

// Zstandard
let d = @zstd.Deflater::new(level=Fast, dict=my_zstd_dict)
let i = @zstd.Inflater::new(dict=my_zstd_dict)
let i_many = @zstd.Inflater::new_with_dictionaries([dict_a, dict_b])

// Snappy raw stream
let d = @snappy.Deflater::new_known_length(data.length())
let buffered = @snappy.Deflater::new()
let i = @snappy.Inflater::new()
let framed = @snappy.FramedDeflater::new()
let framed_bytes = @snappy.compress_framed(data)
let framed_plain = @snappy.decompress_framed(framed_bytes)

// LZ4 with configurable frame flags
let d = @lz4.Deflater::new(dict=my_lz4_dict, options={
  block_independence: false,
  block_checksum: true,
  block_max_size: Size256KB,
  dict_id: 0x12345678U,
  ..FrameOptions::default()
})
let d_with_size = @lz4.Deflater::new_with_content_size(data.length(), dict=my_lz4_dict, options={
  block_max_size: Size256KB,
  ..FrameOptions::default()
})
let i = @lz4.Inflater::new(dict=my_lz4_dict)
let i_with_id = @lz4.Inflater::new_with_dictionary(
  @lz4.Dictionary::new(my_lz4_dict, dict_id=0x12345678U),
)
let i_many = @lz4.Inflater::new_with_dictionaries([
  @lz4.Dictionary::new(dict_a),
  @lz4.Dictionary::new(dict_b),
])

// Get remaining unprocessed input after decompression
let leftover = inflater.remaining()
```

For gzip, bzip2, and lz4 streaming inflaters, call finish() once the upstream source reaches EOF. That lets the wrapper distinguish a true end of input from an exact boundary between concatenated members/streams.
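A minimal sketch of that finish() handshake, assuming a hypothetical next_chunk() input source; the exact signature and return value of finish() are not shown in this document, so treat this as illustrative rather than definitive:

```moonbit
// Illustrative only: drain a (possibly multi-member) gzip stream.
let i = @gzip.Inflater::new()
i.src(first_chunk[:])
loop i.decode() {
  Await => {
    match next_chunk() { // hypothetical chunked input source
      Some(chunk) => { i.src(chunk[:]); continue i.decode() }
      // Upstream EOF: tell the inflater no more bytes are coming, so it
      // can tell a true end of input from a member boundary.
      None => { i.finish(); continue i.decode() }
    }
  }
  Data(out) => { write(out); continue i.decode() }
  End => break
  Error(e) => ...
}
```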
For LZ4, dictionary bytes and dict_id metadata move together: if you pass dictionary bytes and leave dict_id = 0, the encoder derives a deterministic nonzero id from the dictionary prefix; if you do not pass dictionary bytes, any configured dict_id is suppressed. The raw-byte decode helpers derive and validate that same id automatically; if you need to decode frames that use an explicit custom dict_id, use @lz4.Dictionary::new(...) together with @lz4.decompress_with_dictionary(...) or @lz4.Inflater::new_with_dictionary(...). Concatenated LZ4 streams can use different external dictionaries per frame through @lz4.decompress_with_dictionaries(...) and @lz4.Inflater::new_with_dictionaries(...).
For Zstandard, one-shot and streaming decode can likewise select dictionaries per concatenated frame with @zstd.decompress_with_dictionaries(...) and @zstd.Inflater::new_with_dictionaries(...). Formatted dictionaries are matched by frame dict_id; dict_id = 0 frames consume raw dictionary slots in the order supplied, except that a single non-empty raw dictionary is reused across all such frames for compatibility. Empty byte slices act as one-shot plain-frame placeholders when mixing raw-dictionary and plain frames. The single-dictionary decode path now rejects malformed formatted dictionaries instead of silently treating them as raw-content dictionaries.
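As a hedged sketch of those selection rules (the frame layout and dictionary names here are hypothetical, not part of the library's API surface beyond decompress_with_dictionaries itself):

```moonbit
// Illustrative only. Suppose combined_frames concatenates three frames:
//   frame 1: formatted dictionary, embedded dict_id = 7
//   frame 2: dict_id = 0 (consumes a raw-dictionary slot)
//   frame 3: plain frame (no dictionary)
let decompressed = @zstd.decompress_with_dictionaries(combined_frames, [
  formatted_dict_7, // formatted: matched to frame 1 by its dict_id
  raw_dict,         // raw: consumed in order by the first dict_id = 0 frame
  b"",              // empty slice: plain-frame placeholder for frame 3
])
```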
The flate/async package provides async wrappers that work with MoonBit's @io.Reader and @io.Writer interfaces:
```moonbit
// Async DEFLATE compression
async fn compress_stream(reader : &@io.Reader, writer : &@io.Writer) -> Unit {
  @flate.async.compress(reader, writer, level=BestSpeed)
}

// Async DEFLATE decompression
async fn decompress_stream(reader : &@io.Reader, writer : &@io.Writer) -> Unit {
  @flate.async.decompress(reader, writer)
}
```

DEFLATE, gzip, and zlib support compression levels via @flate.CompressionLevel:
| Level | Description |
|---|---|
| NoCompression | Store blocks only (level 0) |
| BestSpeed | Fastest compression (level 1) |
| Level(2..8) | Trade-off between speed and ratio |
| BestCompression | Smallest output (level 9) |
| DefaultCompression | Balanced default (level 6) |
| HuffmanOnly | Huffman encoding, no LZ77 matching |
bzip2 uses its own level parameter (1-9), controlling block size (N x 100KB).
Brotli uses @brotli.CompressionLevel: Level(0) through Level(11), Default (level 6), or Best (level 11). Higher levels use longer hash chains for better compression ratios.
Zstandard uses @zstd.CompressionLevel: Fast, Default, Best, or Level(Int). The encoder maps these to progressively deeper match-finding tiers: Fast scans only the newest hash hit; Default adds a light lazy-match pass plus more recent hash candidates; Best also uses a shallow archived-candidate tier with a longer lazy/nice-match budget. Level(Int) picks finer-grained settings across the same spectrum. The highest levels consult progressively larger archived queues of older hash candidates after the active recent-candidate set, using an interleaved recent-plus-older search order, archive-prefix deduplication, and a small archive-aware lazy-match bias so strong older anchors are not discarded for marginal one-byte-later gains.
Zstandard status: the current codec supports raw, RLE, Huffman-compressed, and treeless literals, plus predefined, RLE, repeat, and custom FSE sequence tables on decode. Raw-content and formatted dictionaries are supported for decode. Compression can emit dictionary IDs for formatted dictionaries, generate custom FSE sequence tables, emit both direct-weight and FSE-compressed custom Huffman literal sections, and choose RLE blocks for repeated literal-only input even when dictionaries are present. Large repeated one-shot inputs are emitted as spec-sized RLE blocks instead of a single oversized block, Best uses a shallow archive tier by default, and high levels treat very sparse accidental matches as a literal-only entropy problem so smaller block targets can beat a single max-sized literal block. The Deflater/Inflater wrappers process frames incrementally while skipping skippable frames; malformed formatted dictionaries fail closed on the single-dictionary decode path; concatenated decode can select formatted dictionaries by dict_id, plus ordered raw-dictionary slots for dict_id = 0 frames, with the legacy single-non-empty-raw-dictionary reuse path still supported. Compression exposes a broader search matrix than the original fast/default/best split, but it remains a valid subset encoder rather than full parity with upstream zstd's entropy tuning and strategy matrix.
Brotli features: the decoder is fully RFC 7932 compliant, including the 122 KB static dictionary with 121 word transforms. The encoder supports context modeling (level 5+), heuristic metablock splitting for larger inputs, deeper level-10/11 hash-chain search, and static-dictionary search across the RFC transform table, covering exact words plus affix/case/omit variants such as " Function " -> "function" and "functio" -> "function" dictionary references. Large inputs that span multiple metablocks preserve the requested Brotli level across chunk boundaries instead of falling back to the simplified fast path. Split probing includes larger candidates such as 96 KiB, fifth-based candidates for five-way heterogeneous layouts, a context-aware boundary scorer that prefers blank lines and structured closing/opening punctuation transitions, and global-fraction split schedules that avoid repeated-target drift after an early boundary shift. The main remaining encoder gap is broader block-splitting/tuning parity with upstream Brotli heuristics, not transform coverage. The Deflater/Inflater wrappers stream incrementally over a single Brotli bitstream and carry rolling history/context across chunk boundaries, though chunking can still affect metablock boundaries and final compression ratio versus one-shot compress(). Output is verified against Go's andybalholm/brotli reference decoder.
Stateful hashers implement the Hasher trait for incremental updates:
```moonbit
let h = @checksum.CRC32::new()
h.update(chunk1[:])
h.update(chunk2[:])
let result = h.checksum()
```

Benchmarked on the native backend against Go's standard library (v0.1.2). Ratio < 1 means MoonBit is faster.
Run benchmarks:

```
./tools/bench.sh --go
```
| Benchmark | MoonBit | Go | Ratio |
|---|---|---|---|
| compress 1 KB | 16 µs | 67 µs | 0.24x |
| compress 10 KB | 63 µs | 124 µs | 0.51x |
| compress 100 KB | 570 µs | 298 µs | 1.91x |
| compress 1 MB | 6.2 ms | 2.0 ms | 3.06x |
| compress speed 1 KB | 12 µs | 134 µs | 0.09x |
| compress speed 10 KB | 15 µs | 125 µs | 0.12x |
| decompress 1 KB | 0.82 µs | 4.0 µs | 0.21x |
| decompress 10 KB | 5.1 µs | 10.7 µs | 0.47x |
| decompress 100 KB | 29 µs | 58 µs | 0.50x |
| decompress 1 MB | 305 µs | 915 µs | 0.33x |
| decompress 10 MB | 4.8 ms | 10.5 ms | 0.46x |
Decompression is 2-5x faster than Go at all sizes. BestSpeed compression is 8-11x faster. Default compression is faster up to 10 KB; at larger sizes Go's more aggressive match-finding gives it an edge.
| Benchmark | MoonBit | Go | Ratio |
|---|---|---|---|
| compress 1 KB | 54 µs | 754 µs | 0.07x |
| compress 10 KB | 500 µs | 2,047 µs | 0.24x |
| compress 100 KB | 5.4 ms | 10.2 ms | 0.53x |
| compress 1 MB | 107 ms | 112 ms | 0.95x |
| decompress 1 KB | 123 µs | 420 µs | 0.29x |
| decompress 10 KB | 167 µs | 541 µs | 0.31x |
| decompress 100 KB | 680 µs | 1,225 µs | 0.55x |
| decompress 1 MB | 6.0 ms | 7.2 ms | 0.83x |
bzip2 uses SA-IS (O(n) suffix array construction) for the Burrows-Wheeler Transform. Go's benchmark uses the system bzip2 binary (C) for compression and Go's compress/bzip2 for decompression.
| Benchmark | MoonBit | Go | Ratio |
|---|---|---|---|
| compress 1 KB | 7.4 µs | 8.6 µs | 0.86x |
| compress 10 KB | 41 µs | 42 µs | 0.98x |
| compress 100 KB | 424 µs | 427 µs | 0.99x |
| compress 1 MB | 4.6 ms | 4.3 ms | 1.05x |
| decompress 1 KB | 3.7 µs | 4.7 µs | 0.78x |
| decompress 10 KB | 16 µs | 26 µs | 0.63x |
| decompress 100 KB | 140 µs | 244 µs | 0.57x |
| decompress 1 MB | 1.6 ms | 2.7 ms | 0.57x |
LZW compression is at parity with Go. Decompression is 1.3-1.8x faster.
Apache-2.0