Low-precision master params/grads/optimizer states #7700
Merged
sfc-gh-truwase approved these changes on Dec 2, 2025
The DeepSpeed optimizer always creates fp32 master params/gradients/optimizer states. However, we sometimes want to keep them in lower precision, e.g. given `torch.autocast` support. This PR allows lower-precision master params/grads/optimizer states when bf16/fp16 is enabled.
DeepSpeed currently accepts an `fp16_master_weights_and_gradients` option under the `fp16` section (not documented) with ZeRO 1/2. This PR extends this to bf16 and also to ZeRO 3.

In the `bf16` section, we can have the new items `bf16_master_weights_and_grads` and `bf16_optimizer_states`. Similarly to `fp16_master_weights_and_grads`, `bf16_master_weights_and_grads` keeps master parameters in bf16, while `bf16_optimizer_states` keeps the optimizer states in bf16 as well. Here is an example configuration:
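(A minimal sketch: the `bf16_*` option names follow the description above, while the surrounding values such as batch size, optimizer settings, and ZeRO stage are placeholder assumptions, not part of this PR.)

```json
{
  "train_batch_size": 8,
  "optimizer": {
    "type": "Adam",
    "params": { "lr": 1e-4 }
  },
  "bf16": {
    "enabled": true,
    "bf16_master_weights_and_grads": true,
    "bf16_optimizer_states": true
  },
  "zero_optimization": {
    "stage": 3
  }
}
```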
Note that `bf16_master_weights_and_grads == True` with `bf16_optimizer_states == False` is supported only with CPU offloading. Also, we don't add an `fp16_optimizer_states` option, as it wouldn't be practical. More details are described in `config-json.md`.

Previously, `torch.autocast` support (the `torch_autocast` section in the config) was not compatible with bf16/fp16 enabled, but we now accept the combination.

This PR also adds test cases for these configurations as well as for the combination with `torch.autocast`.
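For illustration, a minimal sketch of the now-accepted combination of the `torch_autocast` section with bf16. The toy model, training values, and the `enabled` key inside `torch_autocast` are assumptions for the sketch; only the `bf16_*` options come from this PR's description.

```python
import torch
import deepspeed

# Toy model for illustration only.
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1),
)

# Combine torch.autocast with bf16 master weights/grads/optimizer states.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "torch_autocast": {"enabled": True},
    "bf16": {
        "enabled": True,
        "bf16_master_weights_and_grads": True,
        "bf16_optimizer_states": True,
    },
    "zero_optimization": {"stage": 3},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# One training step; the forward pass runs under autocast per the config.
x = torch.randn(8, 1024, device=engine.device)
loss = engine(x).float().pow(2).mean()
engine.backward(loss)
engine.step()
```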