Vad by jakmro · Pull Request #353 · cactus-compute/cactus

jakmro · 2026-02-15T04:21:40Z

No description provided.

Copilot

Pull request overview

This pull request adds comprehensive Voice Activity Detection (VAD) support to the Cactus Engine using the Silero VAD model. The implementation includes a new model type, graph operations for LSTM cells and activation functions (ReLU, Sigmoid), and integration with Whisper/Moonshine transcription models for automatic speech preprocessing.

Changes:

Implemented Silero VAD model with LSTM-based speech detection
Added VAD API endpoint (cactus_vad) for standalone speech segment detection
Integrated VAD preprocessing into transcription workflow (enabled by default)
Created comprehensive language bindings for Python, Flutter, Swift, and Kotlin
Added test infrastructure and automatic VAD weight bundling during model conversion
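
The standalone speech segment detection mentioned in the list above amounts to turning per-frame speech probabilities into time ranges. A minimal Python sketch of that idea follows. The function name, parameters, and defaults are hypothetical; they are not taken from cactus_vad, whose actual post-processing may apply further rules such as padding.

```python
def speech_segments(probs, threshold=0.5, min_silence_frames=3, min_speech_frames=2):
    """Group per-frame speech probabilities into (start, end) frame ranges.

    `end` is exclusive. All parameters are illustrative assumptions.
    """
    segments = []
    start = None   # first frame of the segment currently being built
    silence = 0    # consecutive below-threshold frames seen inside that segment

    def close(end):
        # Keep the segment only if it is long enough to count as speech.
        if end - start >= min_speech_frames:
            segments.append((start, end))

    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
            silence = 0
        elif start is not None:
            silence += 1
            if silence >= min_silence_frames:
                # The segment ended where the run of silence began.
                close(i - silence + 1)
                start, silence = None, 0

    if start is not None:
        # Audio ended while a segment was open; trim trailing silence.
        close(len(probs) - silence)
    return segments
```

Short dips below the threshold (fewer than `min_silence_frames`) do not split a segment, and bursts shorter than `min_speech_frames` are dropped.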

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 5 comments.

File	Description
cactus/models/model_silero_vad.cpp	Core VAD model implementation with STFT, encoder blocks, LSTM cell, and timestamp detection
cactus/models/model.h	VAD model class definition with configuration structures
cactus/kernel/kernel_lstm.cpp	SIMD-optimized LSTM cell kernel using ARM NEON intrinsics
cactus/kernel/kernel_nn.cpp	Added ReLU and Sigmoid activation functions
cactus/kernel/kernel.h	Exposed new kernel functions
cactus/graph/graph_ops_nn.cpp	LSTM cell graph operation implementation
cactus/graph/graph_ops_math.cpp	ReLU and Sigmoid graph operation dispatch
cactus/graph/graph_builder.cpp	Builder methods for LSTM and activation operations
cactus/graph/graph_execute.cpp	Execution dispatcher for new operations
cactus/graph/graph.h	OpType enum additions
cactus/ffi/cactus_vad.cpp	FFI implementation for VAD endpoint with JSON option parsing
cactus/ffi/cactus_init.cpp	Automatic VAD model initialization for Whisper/Moonshine
cactus/ffi/cactus_transcribe.cpp	VAD preprocessing integration in transcription
cactus/ffi/cactus_utils.h	Added use_vad option parsing
cactus/ffi/cactus_complete.cpp	Updated option parsing signature
cactus/ffi/cactus_ffi.h	VAD function declaration
cactus/engine/engine_model.cpp	Silero VAD model type registration
cactus/engine/engine.h	Added SILERO_VAD to ModelType enum
python/src/converter_silero_vad.py	New converter for Silero VAD weights
python/src/converter_llm.py	Automatic VAD bundling for transcription models
python/src/cactus.py	Python VAD API binding
python/src/cli.py	CLI support for VAD model download and testing
python/src/publish_to_hf.py	Added Silero VAD to published models
python/requirements.txt	Added torchaudio dependency
flutter/cactus.dart	Flutter VAD bindings
apple/Cactus.swift	Swift VAD bindings
android/Cactus.kt	Android VAD bindings
android/Cactus.common.kt	Kotlin common VAD types
android/Cactus.android.kt	Android-specific VAD implementation
android/Cactus.ios.kt	iOS Kotlin Multiplatform VAD implementation
tests/test_engine.cpp	VAD test implementation
tests/run.sh	VAD model parameter and environment setup
tests/android/run.sh	Android VAD test configuration
tests/ios/run.sh	iOS VAD test configuration
tests/ios/configure_xcode.rb	Xcode project VAD model copying
tests/ios/CactusTest/CactusTest/AppDelegate.mm	iOS VAD model setup with error handling
docs/cactus_engine.md	VAD API documentation
README.md	Added Silero VAD to supported models table
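
For readers unfamiliar with the LSTM cell and the ReLU/Sigmoid activations listed in the table above, here is a plain-Python reference sketch of one cell step. It assumes the PyTorch gate order (input, forget, cell, output) and PyTorch-style weight shapes; the actual memory layout and signature of the kernel in cactus/kernel/kernel_lstm.cpp are not shown on this page, so every name below is hypothetical.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x):
    return x if x > 0.0 else 0.0


def lstm_cell(x, h, c, w_ih, w_hh, b_ih, b_hh):
    """One LSTM step on plain Python lists.

    w_ih has 4*H rows of len(x) weights, w_hh has 4*H rows of H weights,
    and each bias has 4*H entries (H = len(h)). Gate order i, f, g, o is
    an assumption borrowed from PyTorch.
    """
    n = len(h)
    pre = []
    for r in range(4 * n):
        acc = b_ih[r] + b_hh[r]
        acc += sum(w * v for w, v in zip(w_ih[r], x))
        acc += sum(w * v for w, v in zip(w_hh[r], h))
        pre.append(acc)

    i_gate = [sigmoid(v) for v in pre[0:n]]
    f_gate = [sigmoid(v) for v in pre[n:2 * n]]
    g_gate = [math.tanh(v) for v in pre[2 * n:3 * n]]
    o_gate = [sigmoid(v) for v in pre[3 * n:4 * n]]

    # New cell state mixes the old state with the candidate values.
    c_new = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, g_gate)]
    # New hidden state is the gated, squashed cell state.
    h_new = [o * math.tanh(cv) for o, cv in zip(o_gate, c_new)]
    return h_new, c_new
```

A SIMD kernel would compute the same arithmetic over vectors of floats; this version is only meant as a readable baseline to compare outputs against.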

tests/test_engine.cpp

cactus/ffi/cactus_transcribe.cpp

python/src/converter_silero_vad.py

python/requirements.txt

cactus/ffi/cactus_init.cpp

HenryNdubuaku

1. Is the weight combination, switch to HF and --reconvert flag gonna be another PR?
2. When you remove the venv and start source ./setup afresh, it breaks and crashes with torchaudio.

After those 2 are fixed, I will do the final test.

HenryNdubuaku · 2026-02-16T02:45:03Z

@jakmro I have optimised the VAD pipeline; it should now be 3x faster. Study my PR to your branch carefully to understand the tricks we use to optimise:

Fused STFT+magnitude kernel
Pre-allocated process_chunk buffers, reused in get_speech_timestamps, to avoid repeated memory allocation
Thread only above a size threshold, since not every convolution/matmul op benefits from threading
Bulk WAV read
Windowed-sinc resampler
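
As an illustration of the last item, a windowed-sinc resampler can be sketched in a few lines of Python. This is not the code from the PR, which is not shown on this page; the function name, the Hann window, the `half_width` parameter, and the per-sample gain normalisation are all assumptions chosen to keep the sketch short.

```python
import math


def resample_windowed_sinc(samples, src_rate, dst_rate, half_width=16):
    """Resample a list of floats with a Hann-windowed sinc kernel (illustrative)."""
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    ratio = dst_rate / src_rate
    # When downsampling, lower the cutoff to reduce aliasing.
    cutoff = min(1.0, ratio)
    n_out = int(len(samples) * ratio)
    out = []
    for n in range(n_out):
        t = n / ratio                    # output position in source samples
        center = int(math.floor(t))
        acc = 0.0
        norm = 0.0
        for k in range(center - half_width + 1, center + half_width + 1):
            if 0 <= k < len(samples):
                d = t - k
                x = cutoff * d
                sinc = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
                window = 0.5 * (1.0 + math.cos(math.pi * d / half_width))
                coeff = cutoff * sinc * window
                acc += samples[k] * coeff
                norm += coeff
        # Dividing by the coefficient sum keeps the DC gain at 1 (a sketch simplification).
        out.append(acc / norm if norm != 0.0 else 0.0)
    return out
```

For example, `resample_windowed_sinc(audio, 48000, 16000)` would bring 48 kHz audio down to 16 kHz. A production version would typically precompute the kernel taps rather than evaluate `sin` and `cos` per sample.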

* port silero vad Signed-off-by: jakmro <[email protected]>
* align silero vad conversion process Signed-off-by: jakmro <[email protected]>
* . Signed-off-by: jakmro <[email protected]>
* lstm kernel Signed-off-by: jakmro <[email protected]>
* bundle vad into s2t models Signed-off-by: jakmro <[email protected]>
* clean Signed-off-by: jakmro <[email protected]>
* docs Signed-off-by: jakmro <[email protected]>
* add silero-vad to publish list Signed-off-by: jakmro <[email protected]>
* return early when no speech Signed-off-by: jakmro <[email protected]>
* update cactus_vad return value to reflect JSON response size and update documentation Signed-off-by: jakmro <[email protected]>
* clean Signed-off-by: jakmro <[email protected]>
* refactor test_vad_process Signed-off-by: jakmro <[email protected]>
* update setup script to require Python 3.12 Signed-off-by: jakmro <[email protected]>
* warning fixes Signed-off-by: HenryNdubuaku <[email protected]>
* Aggresively optimise VAD Signed-off-by: HenryNdubuaku <[email protected]>

---------

Signed-off-by: jakmro <[email protected]>
Signed-off-by: HenryNdubuaku <[email protected]>
Co-authored-by: HenryNdubuaku <[email protected]>

jakmro added 10 commits February 11, 2026 01:46

port silero vad

5dce72e

Signed-off-by: jakmro <[email protected]>

align silero vad conversion process

fb0befb

Signed-off-by: jakmro <[email protected]>

.

e876794

Signed-off-by: jakmro <[email protected]>

lstm kernel

bcef02c

Signed-off-by: jakmro <[email protected]>

bundle vad into s2t models

9d5ee79

Signed-off-by: jakmro <[email protected]>

clean

6092319

Signed-off-by: jakmro <[email protected]>

docs

660071f

Signed-off-by: jakmro <[email protected]>

add silero-vad to publish list

4938972

Signed-off-by: jakmro <[email protected]>

Merge branch 'main' into vad

ceb0dcb

Signed-off-by: jakmro <[email protected]>

return early when no speech

b276336

Signed-off-by: jakmro <[email protected]>

jakmro marked this pull request as ready for review February 15, 2026 05:35

Copilot AI review requested due to automatic review settings February 15, 2026 05:35

Copilot started reviewing on behalf of jakmro February 15, 2026 05:35

Copilot AI reviewed Feb 15, 2026

jakmro added 3 commits February 15, 2026 06:55

update cactus_vad return value to reflect JSON response size and upda…

6f299f9

…te documentation Signed-off-by: jakmro <[email protected]>

clean

1f98c65

Signed-off-by: jakmro <[email protected]>

refactor test_vad_process

b84762f

Signed-off-by: jakmro <[email protected]>

HenryNdubuaku reviewed Feb 15, 2026

jakmro and others added 3 commits February 15, 2026 21:14

update setup script to require Python 3.12

fb9e6df

Signed-off-by: jakmro <[email protected]>

warning fixes

4b6df0a

Signed-off-by: HenryNdubuaku <[email protected]>

Aggresively optimise VAD

8fd842d

Signed-off-by: HenryNdubuaku <[email protected]>

HenryNdubuaku merged commit 678a6b4 into main Feb 16, 2026
1 of 2 checks passed

HenryNdubuaku merged 16 commits into main from vad

3 participants
