Releases · cactus-compute/cactus

@lennartvoelz

What's Changed

Fix/issue#490 by @lennartvoelz in #491
simplify and align sdks by @jakmro in #489
remove models by @jakmro in #492
Update model configurations and enhance workflow settings in publish_… by @jakmro in #495
Update workflow to use macos-latest instead of macos-latest-xlarge by @jakmro in #496
Add dynamic max_tokens estimation based on audio length in cactus_tra… by @jakmro in #499
macOS: link clang_rt.osx to fix SME2 (_arm_tpidr2*) link failures under rustc by @yujonglee in #498
Add FFI log control: cactus_log_set_level and cactus_log_set_callback by @yujonglee in #497
Karen/qwen3p5 by @kar-m in #481
CLI upgrades by @rshemet in #504
feat(stt): custom vocabulary biasing for all speech models by @vyomshah05 in #451
Add Gemma 3N (text-only) model support by @ncylich in #493
fix: make FunctionGemma prompt formatting strict by @lennartvoelz in #502
fix: apply logit bias before greedy sampling by @ncylich in #507
remove redundant file linking for tie_word_embeddings by @jakmro in #506
Port general engine improvements for TinyLlama by @ncylich in #513
Speech-to-Text Timestamps by @jakmro in #515
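
The entry "fix: apply logit bias before greedy sampling" (#507) describes an ordering issue: if a greedy sampler takes the argmax of the raw logits, any bias or suppression applied afterwards has no effect. The sketch below only illustrates that ordering in plain Python. The function name `greedy_sample` and the dict-based `logit_bias` are assumptions made for this example, not the Cactus implementation.

```python
import math


def greedy_sample(logits, logit_bias=None):
    """Return the index of the best token AFTER the bias has been applied."""
    biased = list(logits)  # copy so the caller's logits are untouched
    for token_id, bias in (logit_bias or {}).items():
        biased[token_id] += bias
    # Greedy sampling is an argmax over the (already biased) scores.
    return max(range(len(biased)), key=biased.__getitem__)


logits = [2.0, 5.0, 1.0]
print(greedy_sample(logits))                  # 1: token 1 has the highest raw logit
print(greedy_sample(logits, {1: -math.inf}))  # 0: token 1 is suppressed first
```

A bias of negative infinity acts as a hard suppression, which is why the second call falls back to the next-best token.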

New Contributors

@lennartvoelz made their first contribution in #491

Full Changelog: v1.10...v1.11

@jakmro

What's Changed

Enhance model publishing workflow with detailed metadata and licenses by @jakmro in #459
Added parakeet to publish to hf yaml by @ParkiratS in #464
Update telemetry for supported platforms by @justinl66 in #465
added back moe weight conversion by @kar-m in #468
adjust manual workflow for model publish by @jakmro in #470
Parakeet blog by @ammesatyajit in #467
perf: add FP16 fast path for LayerNorm by @yujonglee in #433
Issue #406: Bilinear + Depthwise Optimizations by @PiyawanChaiprasit2006 in #466
ARM SME2: Accelerate MatMul FP16 by @aarav18 in #457
build: add Objective-C ARC support for NPU sources by @jakmro in #475
long transcription by @jakmro in #482
Language detection by @ParkiratS in #471
Parakeet tdt by @ParkiratS in #476
kotlin: expose forceTools in CompletionOptions by @rshemet in #484
Update model list in README and publish_to_hf.yml with new LiquidAI m… by @jakmro in #487
test: updated rag test conditions by @nshejwalkar in #488
optimize scale correction in cactus_attention_f16_h64 by @jakmro in #485
fix greedy sampler ignoring logit suppression by @jakmro in #486

New Contributors

@PiyawanChaiprasit2006 made their first contribution in #466
@aarav18 made their first contribution in #457

Full Changelog: v1.9...v1.10

@yujonglee

What's New

50% faster int4
Parakeet models
LFM2-MOE models
Bug fixes
Hybrid Inference

PRs

fix stt test and add cpp ci by @yujonglee in #413
add IRFFT by @yujonglee in #425
fixed lfm2 vlm lmhead issue that came in with hf 5.0.0 by @kar-m in #426
raspberry pi numbers and linux fixes by @kar-m in #437
Added parakeet model by @ParkiratS in #443
Adding parakeet graph by @ParkiratS in #446
Parakeet kernel by @ParkiratS in #445
added cloud fallback and documentation+tests by @kar-m in #369
Parakeet FFI by @ParkiratS in #447
Parakeet convert and tests by @ParkiratS in #444
Hybrid transcription blog post by @rshemet in #449
Fixed missing engine changes by @ParkiratS in #453
feat(python): add context manager support for safe resource cleanup by @yogyam in #412
Completed ubuntu CICD pipeline by @ncylich in #455
Tie-embed-conversion-fix by @ncylich in #454
tiny graph fix and added benchmark by @kar-m in #456
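
The entry "feat(python): add context manager support for safe resource cleanup" (#412) refers to Python's `with` protocol. As a rough sketch of what that protocol guarantees, the class below releases its resource even when the body raises. `ModelHandle` and its `closed` flag are hypothetical stand-ins for a native handle and are not part of the Cactus Python SDK.

```python
class ModelHandle:
    """Hypothetical wrapper around a native resource (not the Cactus API)."""

    def __init__(self, path):
        self.path = path
        self.closed = False  # stands in for a live native pointer

    def close(self):
        # Idempotent: calling close() twice is harmless.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # never swallow the caller's exception


with ModelHandle("weights/model") as model:
    print(model.closed)  # False while inside the block
print(model.closed)      # True once the block exits
```

Returning `False` from `__exit__` keeps errors visible to the caller while still running the cleanup.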

Full Changelog: v1.8...v1.9

Breaking changes

Weights unfortunately need to be refreshed for this :(

@HenryNdubuaku

What's Changed

Kernel optimisations by @HenryNdubuaku in #397
Improve INT4 by @ncylich and @jrajala6 in #343
add einops dependency to requirements by @jakmro in #371
Add language parameter support for Whisper transcription by @rshemet in #384
added moe support for lfm by @kar-m in #374
Add raw FFI binding for Rust by @yujonglee in #382
fix: handle spaces in paths when running shell commands by @adithya-n05 in #377
fixing sentencepiece detection for transformers 5.0+ (still backwards compatible) by @ncylich in #373
Improve Telemetry by @mhayes853 in #372
proprietary commit by @HenryNdubuaku
Update performance metrics for iPhone 13 Mini and Galaxy A56 by @jakmro in #386
fix: improve version sorting and enhance model export tagging by @jakmro in #387
Add Rust SDK and language parameter documentation by @rshemet in #389
Basic addition of int4 functionality by @jrajala6 in #343
add scalar log by @yujonglee in #390
fix assertion and linux build in rust test by @yujonglee in #392
Justin/api fixes by @justinl66 in #380
Update telemetry by @justinl66 in #394
docs: add compatibility guidelines for runtime and weights by @jakmro in #398
add STFT_COMPLEX, derive stft_magnitude via graph composition by @yujonglee in #395
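
The entry "fix: improve version sorting and enhance model export tagging" (#387) does not say how the sorting was changed. One common pitfall with tags like the ones on this page is that plain string sorting places v1.10 before v1.9. The sketch below shows a numeric sort key as one possible approach; `version_key` is a name invented for this example.

```python
import re


def version_key(tag):
    """Turn 'v1.10' into (1, 10) so tags compare numerically, not lexically."""
    return tuple(int(part) for part in re.findall(r"\d+", tag))


tags = ["v1.9", "v1.10", "v1.6.0", "v1.11", "v1.7"]
print(sorted(tags))                   # lexical order puts v1.10 and v1.11 first
print(sorted(tags, key=version_key))  # ['v1.6.0', 'v1.7', 'v1.9', 'v1.10', 'v1.11']
```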

New Contributors

@yujonglee made their first contribution in #382
@adithya-n05 made their first contribution in #377

Full Changelog: v1.7...v1.8

Note:
This release is incompatible with weights from earlier versions.

@HenryNdubuaku

What's Changed

Brew setup by @HenryNdubuaku
Cactus auth by @HenryNdubuaku
Hybrid inference by the cactus team
Karen/vlm fix by @kar-m in #311
fixed moonshine state resetting and gemma3 4b layernorm loading by @kar-m in #317
fix: LFM2 multiple tool calls by @mhayes853 in #316
fix hf publish by @jakmro in #323
update models list by @jakmro in #324
Fixing pip command errors by @rshemet in #322
Add instructions for installing Ruby version for xcodeproj gem by @jakmro in #327
tests: remove duplicate vlm_multiturn test in runner by @AI-I224 in #332
fix: replace NSLog with CACTUS_LOG for iOS NPU debuggability by @KayaanT in #328
Kernel_attention optimization by @Ayan9074 in #319
M4airbenchmarks by @Ayan9074 in #336
docs: update cactus test command description for transcribe models (#297) by @AI-I224 in #339
Accelerate FP16 matmul via cblas_sgemm for Apple AMX by @KayaanT in #340
Fix hybrid attention sliding window for Gemma (#320) by @jrajala6 in #338
bench: update README benchmark with M2 MacBook Air results by @vyomshah05 in #335
docs: add iPad Pro (12.9") (6th Gen) benchmarks (#296) by @AI-I224 in #333
removed unused graph i/o methods by @ncylich in #345
feat: cpp-native telemetry by @justinl66 in #326
Update CPP Telemetry to point to main DB by @justinl66 in #350
update python bindings for stream transcribe by @jakmro in #351
Update CPP Telemetry by @justinl66 in #352
added only flag by @nshejwalkar in #347
Added warmups and increased iterations for performance testing by @nshejwalkar in #355
CMF Phone 2 Pro benchmarks by @jakmro in #356
Vad by @jakmro in #353
Cli reconvert by @jakmro in #357
Asr cloud merging by @kar-m in #348
Add optional cloud key prompt for transcribe by @rshemet in #359
HF support multiple precision options by @jakmro in #361
Add precision parameter to download_from_hf by @jakmro in #362
revert silero download logic by @jakmro in #365
Cactus clean now clears cache, Session metrics initialized properly for telemetry by @justinl66 in #363
Curl prepack by @kar-m in #358
Fix/f16 reduction accum by @vyomshah05 in #344
Update telemetry by @justinl66 in #366
Accelerate FP16 attention via cblas_sgemm for Apple AMX by @KayaanT in #346

New Contributors

@AI-I224 made their first contribution in #332
@jrajala6 made their first contribution in #338
@vyomshah05 made their first contribution in #335
@nshejwalkar made their first contribution in #347

Full Changelog: v1.6.0...v1.7

@mhayes853: the API has breaking changes

@HenryNdubuaku

What's Changed

Kernel Optimisations & advanced quantisation by @HenryNdubuaku
Moonshine by @kar-m
HF publish by @jakmro
Streaming API by @jakmro
Linux ARM support by @ncylich
Stop generation on model end token by @Ayan9074
i8MM runtime detection by @mhayes853

FFI note: this breaks the API

@HenryNdubuaku

What's Changed

Groupwise quantisation by @HenryNdubuaku
Speech-To-Text streaming by @jakmro
KV Quantisation by @HenryNdubuaku
Evals by @justinl66 @ParkiratS
INT4 support by @HenryNdubuaku
Rust bindings by @mrsarac

Bindings: Please check Cactus FFIs again @jakmro @mrsarac @mhayes853

@jakmro

What's Changed

Cactus index by @jakmro
Function Gemma by @HenryNdubuaku
Perplexity eval to model by @ParkiratS
Tool refactor by @rshemet
F16 kernel updates by @KayaanT
Multi-turn VLM conversation continuity by @HenryNdubuaku
Bugfixes by @ammesatyajit

Instructions for bindings @mhayes853

Should easily replace v1.3 without headaches

@HenryNdubuaku

What's Changed

Apple NPU support by @HenryNdubuaku
Optimized softmax to use Horner's method by @ParkiratS
Tunix finetuning by @ncylich
PCM input stream to whisper transcription by @devabhixda
Cli mobile benchmarks by @jakmro
Telemetry framework by @devabhixda
Python bindings by @HenryNdubuaku
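
The entry "Optimized softmax to use Horner's method" refers to a standard way of evaluating polynomials with one multiply and one add per coefficient. The release notes do not say which polynomial the softmax kernel uses, so the sketch below only demonstrates the method itself. The degree-4 Taylor coefficients for exp are an assumption chosen for illustration.

```python
import math


def horner(coeffs, x):
    """Evaluate coeffs[0]*x**n + ... + coeffs[n] using nested multiplication."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


# 2x^3 - 6x^2 + 2x - 1 at x = 3
print(horner([2, -6, 2, -1], 3))  # 5.0

# Illustrative only: degree-4 Taylor polynomial for exp(x), highest power first.
exp_coeffs = [1 / 24, 1 / 6, 1 / 2, 1.0, 1.0]
print(horner(exp_coeffs, 0.1), math.exp(0.1))  # close for small |x|
```

Because each step reuses the running result, no explicit powers of `x` are ever computed, which is the property that makes the method attractive inside tight kernels.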

Bindings @devabhixda @jakmro @mhayes853

The CLI has changed a bit; read the cactus_ffi.h files carefully.
If you need keys for the Pro, please reach out to [email protected]

Full Changelog: v1.2...v1.3

@HenryNdubuaku

Aggressive memory optimisations by @HenryNdubuaku

Binding Instructions:

Releases on this page, newest first: v1.11, v1.10, v1.9, v1.8, v1.7, v1.6, v1.5, v1.4, v1.3, v1.2