Add Gemma 3N (text-only) model support #493

Merged: HenryNdubuaku merged 23 commits into main from 3n on Mar 7, 2026

@ncylich (Collaborator) commented Mar 4, 2026

Summary

  • Adds text-only Gemma 3N (E4B) inference with AltUp (4-stream alternating updates), Laurel (learned augmented residual layers), and Per-Layer Input (PLI)
  • Implements hybrid local/global attention with sliding window, KV-cache sharing for the last 10 layers, per-head QK normalization, V normalization, and Gaussian top-k activation sparsity
  • Adds weight conversion pipeline and config parsing for gemma3n model type
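As a rough illustration of the "Gaussian top-k activation sparsity" item above, here is a minimal pure-Python sketch. It assumes the technique works by treating activations as approximately Gaussian, computing a cutoff of mean + std × z (where z is the standard-normal quantile for the target sparsity), and applying a shifted ReLU. The function name `gaussian_topk`, the use of the sample standard deviation, and the list-based interface are assumptions for illustration; this is not the code from `model_gemma3n.cpp`.

```python
from statistics import NormalDist, mean, stdev


def gaussian_topk(activations, target_sparsity):
    """Zero out roughly `target_sparsity` of the values, assuming they are ~Gaussian.

    Hypothetical helper for illustration only; `target_sparsity` must lie
    strictly between 0 and 1 and `activations` needs at least two values.
    """
    # z-score below which `target_sparsity` of a standard normal's mass lies.
    z = NormalDist().inv_cdf(target_sparsity)
    # Estimated threshold: values at or below it are treated as inactive.
    cutoff = mean(activations) + stdev(activations) * z
    # Shifted ReLU: keep only the amount by which a value exceeds the cutoff.
    return [max(0.0, a - cutoff) for a in activations]


if __name__ == "__main__":
    print(gaussian_topk([0.0, 1.0, 2.0, 3.0, 4.0], 0.5))
```

The appeal of this formulation is that it approximates a top-k selection without sorting: the cutoff comes from two reductions (mean and standard deviation) over the activation vector.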

ncylich added 3 commits March 3, 2026 23:12
Signed-off-by: Noah Cylich <[email protected]>
Signed-off-by: Noah Cylich <[email protected]>
Copilot AI review requested due to automatic review settings March 4, 2026 18:52
Signed-off-by: Noah Cylich <[email protected]>
Copilot AI (Contributor) left a comment

Pull request overview

Adds runtime + conversion support for a new gemma3n model type, including Gemma 3N-specific config parsing, weight export patterns, and a new C++ model implementation integrated into the engine/tooling paths.

Changes:

  • Add Gemma 3N config extraction + model-type detection in the Python conversion pipeline.
  • Export Gemma 3N-specific weights (AltUp/Laurel/PLI and tower-prefixed tensors) during conversion.
  • Introduce GemmaModel3n in the C++ runtime and wire GEMMA3N into engine + tool-calling code paths.
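The model-type detection mentioned in the first bullet could look roughly like the sketch below. It assumes a Hugging Face-style config dictionary with a `model_type` key and an optional nested `text_config`; the helper name `detect_model_type` and the returned labels are hypothetical and are not taken from `config_utils.py`.

```python
def detect_model_type(config: dict) -> str:
    """Return a coarse model family for a config dict (hypothetical helper)."""
    # Assumption: multimodal checkpoints may nest the language model's
    # settings under "text_config", so both levels are inspected.
    top_level = str(config.get("model_type", "")).lower()
    nested = str(config.get("text_config", {}).get("model_type", "")).lower()

    # Check the more specific prefix first: "gemma3n" also starts with "gemma".
    for candidate in (top_level, nested):
        if candidate.startswith("gemma3n"):
            return "gemma3n"
    if top_level.startswith("gemma") or nested.startswith("gemma"):
        return "gemma"
    return "unknown"


if __name__ == "__main__":
    print(detect_model_type({"model_type": "gemma3n"}))
```

The ordering of the checks is the point of the sketch: a plain prefix test for `gemma` would otherwise swallow the new `gemma3n` type.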

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `python/src/weight_patterns.py` | Adds Gemma 3N global weight mappings and new per-layer patterns. |
| `python/src/tensor_io.py` | Tweaks precision override rules for embed-related tensors (incl. `embed_tokens_per_layer`). |
| `python/src/converter.py` | Adds `gemma3n` config extraction and Gemma 3N-specific weight export (global + tower prefixes). |
| `python/src/config_utils.py` | Adds `gemma3n` detection and Gemma 3N-specific config extraction (AltUp/Laurel/rope/etc.). |
| `python/requirements.txt` | Adds new Python dependencies (`timm`, `sentencepiece`). |
| `cactus/models/model_gemma3n.cpp` | New Gemma 3N model implementation (AltUp/Laurel/PLI + hybrid attention). |
| `cactus/models/model.h` | Declares `GemmaModel3n` and its weight-node layout. |
| `cactus/ffi/cactus_complete.cpp` | Treats `GEMMA3N` like `GEMMA` for tool formatting + stop sequences. |
| `cactus/engine/engine_model.cpp` | Adds `GEMMA3N` attention scaling, config parsing fields, logit softcapping, and model factory wiring. |
| `cactus/engine/engine_constraints.cpp` | Enables Gemma-style tool-call constraints for `GEMMA3N`. |
| `cactus/engine/engine.h` | Extends config/model-type enums and adds Gemma 3N config fields. |
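The "logit softcapping" listed for `cactus/engine/engine_model.cpp` is commonly expressed as `cap * tanh(logit / cap)`. The sketch below shows that formula in Python under the assumption that the PR uses this standard form; the function name `softcap_logits` and the example cap of 30.0 are illustrative and are not taken from the engine code.

```python
import math


def softcap_logits(logits, cap):
    """Squash logits smoothly into [-cap, cap] via cap * tanh(logit / cap).

    Hypothetical helper for illustration; `cap` must be positive.
    """
    if cap <= 0:
        raise ValueError("cap must be positive")
    # tanh is close to the identity near zero, so small logits barely change,
    # while large-magnitude logits saturate towards +/- cap.
    return [cap * math.tanh(x / cap) for x in logits]


if __name__ == "__main__":
    # The cap value here is only an example.
    print(softcap_logits([-100.0, -1.0, 0.0, 1.0, 100.0], 30.0))
```

Because the mapping is monotonic, it preserves the ranking of tokens; it only limits how extreme any single logit can become before sampling.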


ncylich added 9 commits March 4, 2026 12:29
Signed-off-by: Noah Cylich <[email protected]>
Signed-off-by: Noah Cylich <[email protected]>
This reverts commit 3416df3.

Signed-off-by: Noah Cylich <[email protected]>
This reverts commit 941c2f8.

Signed-off-by: Noah Cylich <[email protected]>
Signed-off-by: Noah Cylich <[email protected]>
Signed-off-by: Noah Cylich <[email protected]>
This reverts commit 2edf384.

Signed-off-by: Noah Cylich <[email protected]>
Signed-off-by: Noah Cylich <[email protected]>
@HenryNdubuaku (Collaborator) commented

@ncylich we need to resolve the conflicts; you also need to add Gemma 3N to the README.

@HenryNdubuaku HenryNdubuaku merged commit 81b6b3c into main Mar 7, 2026
4 of 6 checks passed
3 participants