@rahul713rk
Contributor

closes #7415

Description

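Resets `bucket.elements` after reduction in ZeRO Stage 3.
Without this, the element count is never cleared and keeps growing across reductions, so after the first reduction the bucket is treated as full on every add and only one param is reduced at a time.
Added `bucket.elements = 0` after `params_in_bucket.clear()`.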
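
For illustration, here is a minimal sketch of the behavior described above. It is a simplified model, not DeepSpeed's actual code: only `bucket.elements` and `params_in_bucket` come from this PR, while the `Bucket` class, the `simulate` function, and the `bucket_size` threshold check are assumptions made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical stand-in for the gradient bucket (not DeepSpeed's class)."""
    params_in_bucket: list = field(default_factory=list)
    elements: int = 0


def simulate(param_sizes, bucket_size, reset_elements):
    """Return the groups of param indices that get reduced together.

    Assumption: a reduction is triggered when adding the next param
    would push the element count past bucket_size.
    """
    bucket = Bucket()
    reduced_groups = []

    def reduce_bucket():
        # Record what would be reduced, then empty the bucket.
        if bucket.params_in_bucket:
            reduced_groups.append(list(bucket.params_in_bucket))
        bucket.params_in_bucket.clear()
        if reset_elements:
            bucket.elements = 0  # the one-line fix from this PR

    for index, numel in enumerate(param_sizes):
        if bucket.elements + numel > bucket_size:
            reduce_bucket()
        bucket.params_in_bucket.append(index)
        bucket.elements += numel

    reduce_bucket()  # flush whatever is left at the end
    return reduced_groups


if __name__ == "__main__":
    sizes = [4] * 6
    print("with reset:   ", simulate(sizes, bucket_size=10, reset_elements=True))
    print("without reset:", simulate(sizes, bucket_size=10, reset_elements=False))
```

In this model, once the count is left stale after the first reduction, every later add exceeds the threshold, so each param is reduced on its own instead of being batched.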

Collaborator

@tohtana left a comment

@rahul713rk Thank you for the fix!

@tohtana enabled auto-merge (squash) July 8, 2025 01:16
@tohtana merged commit 2722846 into deepspeedai:master Jul 8, 2025
11 checks passed
lpnpcs pushed a commit to lpnpcs/DeepSpeed that referenced this pull request Jul 30, 2025
…pspeedai#7418)

mauryaavinash95 pushed a commit to DataStates/DeepSpeed that referenced this pull request Oct 4, 2025
…pspeedai#7418)

Development

Successfully merging this pull request may close these issues.

[BUG]bucket.elements is not correctly cleared in ZeRO Stage3