[recipe, test] feat: add unit tests for qwen2_audio and kimi_k25_vl recipes #3646
Merged
cuichenx merged 1 commit into NVIDIA-NeMo:main on May 5, 2026
Two recipes under `src/megatron/bridge/recipes/` had no unit-test coverage:
- `qwen2_audio.qwen2_audio_7b_finetune_config`
- `kimi_vl.kimi_k25_vl_sft_config` (and `_get_kimi_k25_vl_pipeline_layout`)
Following the established pattern in `test_kimi_k2.py` and the GLM-4.5V
recipe tests, both new test modules monkeypatch `AutoBridge` to bypass
HuggingFace Hub I/O, then assert the recipe's documented contract end
to end:
`test_qwen2_audio_recipes.py` (18 tests):
- Basic ConfigContainer shape
- Default parallelism (TP=1, PP=1, no CP/SP)
- Default training cadence and validation settings
- seq_length propagation into both `model_cfg` and the dataset provider
- Default freeze flags (language model, audio model, audio projection)
- NullTokenizer wiring
- HFDatasetConversationProvider with the audio maker and
train/dev split selection
- Full-SFT (peft=None) path picks lr=5e-6
- peft="none" string is normalized to full SFT
- peft="lora" path picks lr=1e-4
- peft="dora" attaches a DoRA instance
- Unknown PEFT scheme raises a clear ValueError
- User-supplied `finetune_lr` overrides the default
- Freeze-flag overrides propagate to the provider
- Parallelism overrides apply
- seq_length override propagates to dataset provider
- DDP / checkpoint / RNG defaults
- val/test split overrides flow into the dataset provider
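The PEFT-related checks in the list above can be pictured with a minimal sketch. `resolve_finetune_lr` is a hypothetical stand-in written for illustration, not the recipe's actual code. The two default learning rates come from the list, while the DoRA learning rate and the order of validation versus override are assumptions.

```python
from typing import Optional


def resolve_finetune_lr(peft: Optional[str], finetune_lr: Optional[float] = None) -> float:
    """Hypothetical stand-in for the recipe's learning-rate selection."""
    # The string "none" is normalized to full SFT, as the tests expect.
    scheme = None if peft is None or peft.lower() == "none" else peft.lower()
    if scheme not in (None, "lora", "dora"):
        raise ValueError(f"Unknown PEFT scheme: {peft!r}")
    if finetune_lr is not None:
        return finetune_lr  # a user-supplied value overrides the default
    # Assumption: DoRA shares the LoRA default; the source only states LoRA.
    return 5e-6 if scheme is None else 1e-4


def test_default_learning_rates():
    assert resolve_finetune_lr(None) == 5e-6
    assert resolve_finetune_lr("none") == 5e-6
    assert resolve_finetune_lr("lora") == 1e-4


def test_user_lr_overrides_default():
    assert resolve_finetune_lr("lora", finetune_lr=3e-5) == 3e-5


def test_unknown_scheme_raises():
    try:
        resolve_finetune_lr("not-a-scheme")
    except ValueError as err:
        assert "not-a-scheme" in str(err)
    else:
        raise AssertionError("expected ValueError")
```

The real tests call the recipe function and inspect the returned config. The sketch only isolates the decision table they pin down.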
`test_kimi_k25_vl.py` (16 tests):
- Pipeline-layout helper: PP=1/VP=1 returns None, PP=16/VP=1 matches
the documented stage breakdown, all 7 known PP/VP combinations
return non-empty list-of-lists, vp_size=None is normalized to 1,
unknown combinations raise ValueError, layout is a fresh copy on
each call
- SFT recipe: ConfigContainer shape, training cadence, parallelism
(TP=2, PP=16, EP=32, SP on), full activation recompute (uniform, 1
layer), pipeline split flags off, layout matches helper output,
Muon-compatible DDP wiring (no dist optimizer, no param overlap),
HF processor path on dataset, NullTokenizer with vocab_size pulled
from the provider, bf16 mixed precision, fp32 optimizer precision,
`dist_muon` selected, MoE wiring (alltoall + deepep, fused permute,
grouped GEMM), TE backend with CUDA graphs off, kernel selections,
comm overlap defaults, checkpoint cadence, rope-fusion-off keeps
the experimental dist flag at its default `False`, rope-fusion-on
flips the experimental dist flag to `True`
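The pipeline-layout checks listed above follow a common shape for table-driven helpers. The sketch below uses a hypothetical `get_pipeline_layout` with a made-up two-entry table. The stage names and the `(2, 1)` entry are placeholders, not the real Kimi K2.5 VL layouts.

```python
import copy
from typing import List, Optional

# Hypothetical layout table keyed by (pp_size, vp_size). The entries are
# placeholders chosen for the sketch, not the documented stage breakdown.
_LAYOUTS = {
    (1, 1): None,
    (2, 1): [["embedding", "decoder"], ["decoder", "loss"]],
}


def get_pipeline_layout(pp_size: int, vp_size: Optional[int] = None) -> Optional[List[List[str]]]:
    """Return a fresh copy of the layout for a known (PP, VP) pair."""
    vp = 1 if vp_size is None else vp_size  # vp_size=None is treated as 1
    key = (pp_size, vp)
    if key not in _LAYOUTS:
        raise ValueError(f"No pipeline layout for PP={pp_size}, VP={vp}")
    # deepcopy so that callers cannot mutate the shared table.
    return copy.deepcopy(_LAYOUTS[key])


def test_trivial_layout_is_none():
    assert get_pipeline_layout(1, 1) is None


def test_vp_none_is_normalized():
    assert get_pipeline_layout(2, None) == get_pipeline_layout(2, 1)


def test_layout_is_fresh_copy():
    first = get_pipeline_layout(2, 1)
    first[0].append("mutated")
    assert get_pipeline_layout(2, 1)[0] == ["embedding", "decoder"]


def test_unknown_combination_raises():
    try:
        get_pipeline_layout(3, 1)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
```

The fresh-copy check matters because a recipe may edit the returned layout in place. Without the copy, one call could silently change the result of the next.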
No production code changes — tests only.
Refs issue NVIDIA-NeMo#3177.
Signed-off-by: lonexreb <[email protected]>
cuichenx approved these changes on May 5, 2026
/ok to test fda93c7
Summary
Two recipes under `src/megatron/bridge/recipes/` have no unit-test coverage. Add them.
- `qwen2_audio.qwen2_audio_7b_finetune_config`: 18 tests
- `kimi_vl.kimi_k25_vl_sft_config` (and `_get_kimi_k25_vl_pipeline_layout`): 16 tests
Both modules monkeypatch `AutoBridge` to bypass HuggingFace Hub I/O and assert the recipe's documented contract end-to-end. Pattern mirrors the existing `test_kimi_k2.py` and `test_glm_45v_recipes.py`.
Refs #3177.
Why this matters
The original issue notes that Kimi K2 had a recipe but no recipe test, which let an MCore-side API breaking change land undetected. Kimi K2 itself has since gained `tests/unit_tests/recipes/kimi/test_kimi_k2.py`, but a sweep of `src/megatron/bridge/recipes/` vs `tests/unit_tests/recipes/` still surfaces several recipe modules with no recipe-test counterpart. This PR closes the easiest two, both of which are entirely missing today.
What's covered
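`test_qwen2_audio_recipes.py`
- Basic ConfigContainer shape, default training cadence (`train_iters=2000`, `GBS=32`, `MBS=1`), validation settings.
- Default parallelism (`TP=1`, `PP=1`), `seq_length` propagation into both `model_cfg` and the `HFDatasetConversationProvider`.
- Dataset provider wiring: `make_default_audio_dataset` maker, default `train`/`dev` split selection, processor path.
- PEFT paths: `peft=None` and `peft="none"` → `lr=5e-6`; `peft="lora"` → `lr=1e-4`; `peft="dora"` attaches a `DoRA` instance; unknown PEFT scheme raises a clear `ValueError`.
- User-supplied `finetune_lr` overrides the default.
- Freeze-flag, parallelism, and `seq_length` overrides apply.
- DDP and checkpoint defaults (`torch_dist`, parallel save), RNG seed, val/test maker overrides.
`test_kimi_k25_vl.py`
`_get_kimi_k25_vl_pipeline_layout` helper:
- `PP=1, VP=1` returns `None`.
- `PP=16, VP=1` (the SFT default) matches the documented stage breakdown.
- All 7 known `(PP, VP)` combinations return non-empty list-of-lists.
- `vp_size=None` is normalized to 1.
- Unknown combinations raise `ValueError`.
`kimi_k25_vl_sft_config`:
- Parallelism (`TP=2`, `PP=16`, `EP=32`, `SP` on, expert TP=1).
- Full activation recompute (`uniform`, 1 layer), pipeline split flags, layout matches helper output.
- Muon-compatible DDP wiring (`use_distributed_optimizer=False`, `overlap_param_gather=False`, `no_shard` strategy).
- NullTokenizer with `vocab_size` pulled from the provider.
- bf16 mixed precision, fp32 optimizer precision, `dist_muon` selected.
- MoE wiring (`alltoall` + `deepep` flex backend, fused permute, grouped GEMM, shared-expert overlap on).
- TE backend with CUDA graphs off, kernel selections (`cross_entropy_fusion_impl="te"`).
- Comm overlap defaults (`tp_comm_overlap=False`, `delay_wgrad_compute=False`).
- Rope fusion: when `model.apply_rope_fusion=False` the experimental dist flag stays `False`; when `True` the recipe flips `cfg.dist.enable_megatron_core_experimental=True`.
Test plan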
- `python3 -m ast` parse of new files
- `ruff check` clean
- `ruff format --check` clean
Risk
Zero — tests only. No production code changes.
Notes for reviewers
- … `AttributeError` here, matching the existing `test_kimi_k2.py` style.
- `qwen2_audio` reuses the existing `HFDatasetConversationProvider` dataclass, which is import-safe (no HF I/O at construction).
- … `apply_rope_fusion=True`, exercising the recipe's `if cfg.model.apply_rope_fusion:` branch.
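The rope-fusion branch mentioned in the notes above can be exercised from both sides with very little setup. The sketch below is an illustration only: `apply_rope_fusion_rule` and `make_cfg` are hypothetical helpers, and the config is modeled with `SimpleNamespace` rather than the real ConfigContainer.

```python
from types import SimpleNamespace


def apply_rope_fusion_rule(cfg):
    """Hypothetical stand-in for the recipe branch under test."""
    if cfg.model.apply_rope_fusion:
        # Enabling rope fusion switches on the experimental dist flag.
        cfg.dist.enable_megatron_core_experimental = True
    return cfg


def make_cfg(apply_rope_fusion: bool):
    # The experimental flag starts at its default of False.
    return SimpleNamespace(
        model=SimpleNamespace(apply_rope_fusion=apply_rope_fusion),
        dist=SimpleNamespace(enable_megatron_core_experimental=False),
    )


def test_rope_fusion_off_keeps_default():
    cfg = apply_rope_fusion_rule(make_cfg(False))
    assert cfg.dist.enable_megatron_core_experimental is False


def test_rope_fusion_on_flips_flag():
    cfg = apply_rope_fusion_rule(make_cfg(True))
    assert cfg.dist.enable_megatron_core_experimental is True
```

Covering both values guards against the branch being removed as well as against the flag being set unconditionally.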