fixed moonshine state resetting and gemma3 4b layernorm loading #317

Merged
HenryNdubuaku merged 10 commits into cactus-compute:main from BruinAI:karen/gemma-moonshine-fixes
Feb 4, 2026

kar-m (Collaborator) commented Feb 3, 2026

No description provided.

Copilot AI review requested due to automatic review settings February 3, 2026 07:52
Copilot AI (Contributor) left a comment

Pull request overview

This PR addresses Moonshine cache/state reset correctness and improves compatibility when loading Gemma3 4B layernorm weights, alongside a couple of related runtime behavior tweaks.

Changes:

  • Add an additional output norm key pattern (model.language_model.norm.weight) to support Gemma3 4B layernorm loading.
  • Fix Moonshine cache reset by invalidating and clearing persistent encoder-related node IDs (including last_encoder_post_norm_node_ and per-layer encoder K/V persistent nodes).
  • Change invalidate_persistent() so that it also removes the node ID from the persistent-preservation set, and remove the Moonshine-specific trailing-silence padding in the transcribe FFI path.
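
The first change above boils down to accepting more than one checkpoint key for the same tensor. The real change lives in python/src/weight_patterns.py; what follows is only a minimal C++ sketch of the idea. The helper `find_first_key`, the list `kOutputNormKeys`, and the `model.norm.weight` entry are assumptions made for illustration. Only `model.language_model.norm.weight` comes from the PR.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: return the first candidate key that exists in the
// checkpoint. The int value stands in for whatever tensor handle is stored.
std::optional<std::string> find_first_key(
    const std::unordered_map<std::string, int>& checkpoint,
    const std::vector<std::string>& candidates) {
    for (const auto& key : candidates) {
        if (checkpoint.count(key) != 0) {
            return key;
        }
    }
    return std::nullopt;  // none of the known names matched
}

// Candidate names for the output norm weight.
// "model.norm.weight" is an assumed pre-existing pattern;
// "model.language_model.norm.weight" is the key named in this PR.
const std::vector<std::string> kOutputNormKeys = {
    "model.norm.weight",
    "model.language_model.norm.weight",
};
```

With a list like this, a checkpoint that only carries the nested `language_model` key still resolves to an output norm instead of failing the lookup.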

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| python/src/weight_patterns.py | Extends output norm name matching to recognize an additional HF weight key. |
| cactus/models/model_moonshine.cpp | Ensures Moonshine reset clears/invalidates persistent nodes so encoder-derived cached nodes don’t survive across resets. |
| cactus/kernel/kernel_conv.cpp | Changes Apple Accelerate conv1d weight staging order. |
| cactus/graph/graph_builder.cpp | invalidate_persistent() now also removes IDs from the persistent-preservation set. |
| cactus/ffi/cactus_transcribe.cpp | Removes Moonshine waveform trailing-silence insertion before passing audio to the model. |
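
The Moonshine reset described for model_moonshine.cpp can be pictured roughly as follows. This is a sketch under stated assumptions, not the code from the PR: the struct `EncoderCacheIds`, the function `reset_encoder_cache`, the `invalidate` callback, and the use of 0 as a "not set" sentinel are all hypothetical. Only the idea of invalidating and clearing the post-norm node and the per-layer encoder K/V nodes is taken from the summary.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical container for the persistent node IDs the encoder caches.
// An ID of 0 is assumed to mean "not set".
struct EncoderCacheIds {
    std::size_t last_encoder_post_norm_node = 0;
    std::vector<std::size_t> encoder_k_nodes;  // one entry per layer
    std::vector<std::size_t> encoder_v_nodes;  // one entry per layer
};

// Invalidate every cached node in the graph, then clear the stored IDs so a
// later run cannot reuse encoder output from the previous audio.
void reset_encoder_cache(EncoderCacheIds& ids,
                         const std::function<void(std::size_t)>& invalidate) {
    if (ids.last_encoder_post_norm_node != 0) {
        invalidate(ids.last_encoder_post_norm_node);
    }
    for (std::size_t id : ids.encoder_k_nodes) {
        if (id != 0) invalidate(id);
    }
    for (std::size_t id : ids.encoder_v_nodes) {
        if (id != 0) invalidate(id);
    }
    // Forget the IDs themselves, keeping one slot per layer.
    ids.last_encoder_post_norm_node = 0;
    ids.encoder_k_nodes.assign(ids.encoder_k_nodes.size(), 0);
    ids.encoder_v_nodes.assign(ids.encoder_v_nodes.size(), 0);
}
```

Passing the invalidation step as a callback keeps the sketch independent of the real graph class; in the actual model it would presumably call the graph's invalidate_persistent().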

Comment on lines 711 to 714
void CactusGraph::invalidate_persistent(size_t persistent_node_id) {
    populated_node_ids_.erase(persistent_node_id);
    persistent_node_ids_.erase(persistent_node_id);
}

Copilot AI Feb 3, 2026

invalidate_persistent() now also removes the node ID from persistent_node_ids_, which changes soft_reset() behavior (invalidated nodes will no longer be preserved across resets). There doesn’t appear to be test coverage for this contract; please add a unit test that creates a persistent node, executes to populate it, calls invalidate_persistent(), then soft_reset(), and asserts the invalidated node is no longer preserved / addressable (and can be safely recreated).
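
The requested test could look something like the sketch below. It runs against a toy stand-in, not the real CactusGraph: the class `ToyGraph`, its `add_persistent()`, `execute()`, `is_populated()` and `is_persistent()` members, and the assumed `soft_reset()` rule (keep only populated nodes that are still registered as persistent) are all hypothetical. Only the body of `invalidate_persistent()` mirrors the snippet under review.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>

// Toy model of the graph's persistent-node bookkeeping (not the real class).
class ToyGraph {
public:
    std::size_t add_persistent() {
        const std::size_t id = next_id_++;
        persistent_node_ids_.insert(id);
        return id;
    }
    void execute() {
        // Pretend execution populates every registered persistent node.
        for (std::size_t id : persistent_node_ids_) populated_node_ids_.insert(id);
    }
    void invalidate_persistent(std::size_t id) {
        populated_node_ids_.erase(id);
        persistent_node_ids_.erase(id);  // the behavior added in this PR
    }
    void soft_reset() {
        // Assumed semantics: only populated nodes that are still registered
        // as persistent survive a soft reset.
        std::unordered_set<std::size_t> kept;
        for (std::size_t id : populated_node_ids_) {
            if (persistent_node_ids_.count(id) != 0) kept.insert(id);
        }
        populated_node_ids_ = std::move(kept);
    }
    bool is_populated(std::size_t id) const { return populated_node_ids_.count(id) != 0; }
    bool is_persistent(std::size_t id) const { return persistent_node_ids_.count(id) != 0; }

private:
    std::size_t next_id_ = 1;
    std::unordered_set<std::size_t> persistent_node_ids_;
    std::unordered_set<std::size_t> populated_node_ids_;
};

// The scenario from the review comment: create, populate, invalidate,
// soft-reset, then check the node is gone and a new one can be created.
bool invalidated_node_is_dropped() {
    ToyGraph g;
    const std::size_t stale = g.add_persistent();
    const std::size_t kept = g.add_persistent();
    g.execute();
    g.invalidate_persistent(stale);
    g.soft_reset();
    const bool stale_gone = !g.is_populated(stale) && !g.is_persistent(stale);
    const bool kept_survives = g.is_populated(kept) && g.is_persistent(kept);
    const std::size_t fresh = g.add_persistent();
    g.execute();
    const bool recreated = fresh != stale && g.is_populated(fresh);
    return stale_gone && kept_survives && recreated;
}
```

A real test would replace `ToyGraph` with the project's graph type and its actual node-creation and execution calls, keeping the same sequence of steps and assertions.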

kar-m added 9 commits February 3, 2026 09:28
Signed-off-by: Karen Mosoyan <[email protected]>
@HenryNdubuaku HenryNdubuaku merged commit e7198d8 into cactus-compute:main Feb 4, 2026
1 of 2 checks passed

3 participants