@tohtana tohtana commented Sep 28, 2025

This PR improves the usability of the leaf module feature.

Here are the changes:

  • Allow enabling the leaf module via both the DeepSpeed config and APIs.
  • Relax matching criteria to support class-based matching.
  • Support multiple ways of specifying the target module: class, class name (with or without package name), module name, or suffix.
  • Add documentation to the training guide, including config snippets and explanations of default behavior.
  • Add default classes (e.g., Mixtral, Qwen2/Qwen3) that automatically enable the leaf module feature. (Requests to add more classes are welcome.)
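
To make the matching options above concrete, here is a minimal, self-contained sketch of how selection by class, class name (with or without package name), module name, or suffix could work. It does not use DeepSpeed or PyTorch: `Node`, `CustomMoEBlock`, `qualified_name`, and `find_leaf_candidates` are hypothetical names invented for this illustration, and the plain `str.endswith` suffix rule is an assumption, not necessarily what the PR implements.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a module in a model tree (not a DeepSpeed/PyTorch type)."""
    children: dict = field(default_factory=dict)

    def named_modules(self, prefix=""):
        """Yield (dotted_name, node) pairs depth first; the root has the empty name."""
        yield prefix, self
        for key, child in self.children.items():
            child_name = f"{prefix}.{key}" if prefix else key
            yield from child.named_modules(child_name)


class CustomMoEBlock(Node):
    """Hypothetical MoE block class, used only for this illustration."""


def qualified_name(cls):
    """Return the class name including its package/module path."""
    return f"{cls.__module__}.{cls.__qualname__}"


def find_leaf_candidates(root, classes=(), names=(), name_suffixes=()):
    """Return dotted names of modules matched by ANY of the three selectors."""
    matched = []
    for name, module in root.named_modules():
        cls = type(module)
        # A class selector may be a class object, a bare class name,
        # or a fully qualified class name.
        by_class = any(
            (isinstance(c, type) and isinstance(module, c))
            or c in (cls.__name__, qualified_name(cls))
            for c in classes
        )
        by_name = name in names
        # Assumption: a suffix matches when the dotted name ends with it.
        by_suffix = any(name.endswith(s) for s in name_suffixes)
        if by_class or by_name or by_suffix:
            matched.append(name)
    return matched


# Toy model: two layers, only layer 0 uses the custom MoE block.
model = Node({"transformer": Node({"layers": Node({
    "0": Node({"experts": CustomMoEBlock()}),
    "1": Node({"experts": Node()}),
})})})

print(find_leaf_candidates(model, classes=[CustomMoEBlock]))
print(find_leaf_candidates(model, name_suffixes=["experts"]))
```

In this toy tree, the class selector picks only the layer-0 block, while the suffix selector picks every module whose dotted name ends in `experts`, which is why a single selector is usually enough.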

@sfc-gh-truwase sfc-gh-truwase enabled auto-merge (squash) October 3, 2025 09:25
@sfc-gh-truwase sfc-gh-truwase merged commit 7d9a2f2 into deepspeedai:master Oct 3, 2025
10 of 12 checks passed
mauryaavinash95 pushed a commit to DataStates/DeepSpeed that referenced this pull request Oct 4, 2025
…eria, add document, etc.) (deepspeedai#7604)

Signed-off-by: Masahiro Tanaka <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Comment on lines +135 to +138
"leaf_module": {
    "classes": ["my_package.layers.CustomMoEBlock"],
    "names": ["transformer.layers.0.experts"],
    "name_suffixes": ["experts"]
Contributor

@sfc-gh-sbekman sfc-gh-sbekman Oct 6, 2025

This part of the doc is somewhat confusing: a reader who jumps here directly, without reading the API section, may believe that all 3 entries are needed.

Too bad JSON doesn't allow comments. But perhaps the first paragraph could start with: "While the example shows all 3 leaf_module keys, typically you will use just one of these"?
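
As an illustration of the point above, a config that uses only one of the three keys could be built like this. This is a sketch under assumptions: nesting `leaf_module` under `zero_optimization` and the `"stage": 3` setting are not shown in the quoted snippet and should be checked against the training guide. Building the config as a Python dict also allows comments, which JSON does not.

```python
import json

# Assumption: "leaf_module" sits under "zero_optimization" with ZeRO stage 3;
# verify the exact placement against the DeepSpeed training guide.
ds_config = {
    "zero_optimization": {
        "stage": 3,
        "leaf_module": {
            # Only one selector is given; "classes" and "names" are omitted.
            "name_suffixes": ["experts"],
        },
    },
}

# Serialize to the JSON form that would go in a config file.
print(json.dumps(ds_config, indent=2))
```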

tohtana added a commit that referenced this pull request Oct 7, 2025
Update document of leaf module config as suggested
[here](#7604 (comment)).

Signed-off-by: Masahiro Tanaka <[email protected]>
Liangliang-Ma pushed a commit to Liangliang-Ma/DeepSpeed that referenced this pull request Oct 13, 2025
…eria, add document, etc.) (deepspeedai#7604)

Signed-off-by: Masahiro Tanaka <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Signed-off-by: Ma, Liangliang <[email protected]>
Liangliang-Ma pushed a commit to Liangliang-Ma/DeepSpeed that referenced this pull request Oct 13, 2025
Update document of leaf module config as suggested
[here](deepspeedai#7604 (comment)).

Signed-off-by: Masahiro Tanaka <[email protected]>
Signed-off-by: Ma, Liangliang <[email protected]>