A real-time low-latency music source separation plugin. Drop it on a track and get 4 separate stems: drums, bass, other, and vocals.
With 11.6 ms of latency (roughly 512 samples at 44.1 kHz), it is made for spatializing DJ sets in real time: split the mix into stems, place them in the room, and create an immersive experience.
Built with JUCE and ONNX Runtime, using HS-TasNet.
Available as VST3 and AU.
StemgenRT is a multi-output plugin with 4 stereo output buses:
- Drums
- Bass
- Other (synths, guitars, etc.)
- Vocals
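In JUCE terms, a layout like this is declared in the `AudioProcessor` constructor. Below is a minimal sketch of that pattern; the class and bus names are illustrative, not copied from StemgenRT's source:

```cpp
// Sketch of a 1-in / 4-stereo-out bus layout in JUCE (illustrative only).
StemgenRTProcessor::StemgenRTProcessor()
    : juce::AudioProcessor (BusesProperties()
          .withInput  ("Input",  juce::AudioChannelSet::stereo(), true)
          .withOutput ("Drums",  juce::AudioChannelSet::stereo(), true)
          .withOutput ("Bass",   juce::AudioChannelSet::stereo(), true)
          .withOutput ("Other",  juce::AudioChannelSet::stereo(), true)
          .withOutput ("Vocals", juce::AudioChannelSet::stereo(), true))
{
}
```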
To set it up:
- Insert StemgenRT on your source track (e.g., a DJ mix or full song)
- Create 4 auxiliary/bus tracks to receive each stem
- Route each of the plugin's outputs to its corresponding aux track
Check your DAW's documentation for multi-output plugin routing.
**Note:** Set your DAW to 44.1 kHz; the model only works at this sample rate.
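A defensive way to handle this, as a hypothetical sketch rather than the plugin's actual behavior, is to engage the model only when the host reports 44.1 kHz:

```cpp
// Hypothetical guard: run separation only at 44.1 kHz, pass audio through otherwise.
void StemgenRTProcessor::prepareToPlay (double sampleRate, int /*samplesPerBlock*/)
{
    modelActive = (sampleRate == 44100.0);  // the model only runs at this rate
}
```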
**Note:** The macOS plugin is not signed (yet). You need to sign it yourself:

```sh
codesign --force --deep --sign - StemgenRT.component
```
First, grab the ONNX Runtime dependency:

```sh
# macOS
./scripts/download-onnxruntime.sh

# Windows (PowerShell)
./scripts/download-onnxruntime.ps1
```

Then build with CMake:

```sh
cmake -S . -B build
cmake --build build
```

For a release build:

```sh
cmake -S . -B build-release
cmake --build build-release
```

The plugin runs a neural network to separate audio, but neural networks are slow and audio callbacks are fast. To bridge the gap:
- Audio thread collects incoming samples and feeds them to a ring buffer
- Inference thread runs the model asynchronously in the background
- Audio thread picks up the processed stems when ready
If inference can't keep up, the plugin gracefully crossfades to the dry signal rather than glitching.
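Under the hood this is a single-producer/single-consumer handoff. The sketch below shows the general pattern with a plain atomic ring buffer; it is illustrative, not StemgenRT's actual implementation (which could just as well use JUCE's `AbstractFifo`, which provides the same mechanics):

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Minimal SPSC ring buffer: the audio thread writes, the inference thread reads.
// Lock-free; nothing here allocates or blocks on the audio thread.
class RingBuffer
{
public:
    explicit RingBuffer (std::size_t capacity) : data (capacity) {}

    bool push (const float* src, std::size_t n)        // called on the audio thread
    {
        const auto w = writePos.load (std::memory_order_relaxed);
        const auto r = readPos.load (std::memory_order_acquire);
        if (data.size() - (w - r) < n)                 // not enough free space
            return false;
        for (std::size_t i = 0; i < n; ++i)
            data[(w + i) % data.size()] = src[i];
        writePos.store (w + n, std::memory_order_release);
        return true;
    }

    bool pop (float* dst, std::size_t n)               // called on the inference thread
    {
        const auto w = writePos.load (std::memory_order_acquire);
        const auto r = readPos.load (std::memory_order_relaxed);
        if (w - r < n)                                 // not enough samples buffered yet
            return false;
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = data[(r + i) % data.size()];
        readPos.store (r + n, std::memory_order_release);
        return true;
    }

private:
    std::vector<float> data;
    std::atomic<std::size_t> writePos { 0 }, readPos { 0 };
};
```

One FIFO like this would carry input from the audio callback to the inference thread, and a second would carry the separated stems back; the crossfade-to-dry fallback kicks in when the return FIFO can't supply a full block in time.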
A few DSP tricks help the model out:
- 80 Hz crossover — Low frequencies bypass the network entirely. The model struggles with sub-bass, and this keeps the low end tight.
- Input normalization — Quiet signals are boosted to a consistent level before inference, then scaled back after. This pushes the model's noise floor below the signal level, dramatically improving quality on quiet passages (see the sketch after this list).
- Vocals gate — Detects spurious low-level content in the vocals stem (common on instrumentals) using both energy ratio and absolute level thresholds. Gated content is transferred to the "other" stem to preserve total energy. Asymmetric attack/release smoothing prevents pumping.
- Soft gating — When input is silent, output is silent. Prevents the model from hallucinating noise.
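As referenced above, here is a minimal sketch of the input-normalization idea. The target level, gain cap, and `runModel` placeholder are assumptions for illustration, not values from the plugin:

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Stub standing in for the ONNX inference call (hypothetical signature).
static void runModel (const std::vector<float>& input,
                      std::vector<std::vector<float>>& stems)
{
    stems.assign (4, input);  // stub: real code would run the separation model
}

// Sketch of pre-inference normalization: boost quiet input to a known RMS,
// run the model, then scale each stem back by the inverse gain.
void separateNormalized (std::vector<float>& input,
                         std::vector<std::vector<float>>& stems)
{
    constexpr float targetRms = 0.1f;    // assumed reference level
    constexpr float maxGain   = 100.0f;  // assumed cap so silence isn't boosted into noise

    float sumSquares = 0.0f;
    for (float s : input)
        sumSquares += s * s;

    const auto  n    = static_cast<float> (std::max<std::size_t> (input.size(), 1));
    const float rms  = std::sqrt (sumSquares / n);
    const float gain = std::min (targetRms / std::max (rms, 1.0e-9f), maxGain);

    for (float& s : input)
        s *= gain;                        // push the signal above the model's noise floor

    runModel (input, stems);              // separation runs on the normalized signal

    for (auto& stem : stems)
        for (float& s : stem)
            s /= gain;                    // restore the original level
}
```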
You might expect GPU inference to be faster, but for this particular model it often isn't:
- 1D convolutions — GPUs are optimized for 2D (images). 1D audio convolutions don't parallelize as well.
- Batch size of 1 — Real-time audio processes one chunk at a time. GPUs shine with large batches.
- Memory-bound ops — Reshapes and audio operations are limited by memory bandwidth, not compute. Your CPU cache is actually fast for this.
- Kernel launch overhead — Each GPU operation has ~5–20 μs of launch overhead. With many small ops per inference pass it adds up fast: at, say, 200 ops of 10 μs each (purely for illustration), that is 2 ms of pure overhead against an 11.6 ms budget.
That said, GPU builds are available if you want to try them.
License: MIT
