[Distributed] Add PipelineDatasetPreprocessor to avoid mem leaks in pipeline parallel #75446
|
Your PR was submitted successfully. Thank you for contributing to this open-source project! |
|
/rerun all-failed |
a7fb152 to c28eab5
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             develop    #75446   +/-   ##
===========================================
  Coverage           ?   100.00%
===========================================
  Files              ?         1
  Lines              ?         9
  Branches           ?         0
===========================================
  Hits               ?         9
  Misses             ?         0
  Partials           ?         0
micro_batch_data = self._load_micro_batch(self._index)
self._index += 1

if self._index >= self._acc_steps:
StopIteration is already raised above, so is there a problem with this check?
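To make the question concrete, here is a minimal Python sketch of an iterator with the same two checks as the snippet above. Everything in it is an assumption for illustration: the class name `MicroBatchIterator`, the slicing in `_load_micro_batch`, and in particular the body of the second branch (dropping the data reference), which the quoted diff does not show. It only demonstrates that a second check placed after the increment is reachable: the first check fires on the call after the last micro-batch, while the second fires as the last micro-batch is handed out.

```python
class MicroBatchIterator:
    """Hypothetical iterator that yields acc_steps micro-batches from one batch."""

    def __init__(self, data, acc_steps):
        self._data = data
        self._acc_steps = acc_steps
        self._index = 0
        self._micro_size = len(data) // acc_steps

    def __iter__(self):
        return self

    def _load_micro_batch(self, index):
        # Assumed slicing scheme; the real loader is not shown in the thread.
        begin = index * self._micro_size
        return self._data[begin:begin + self._micro_size]

    def __next__(self):
        # First check: only true on the call made after the last micro-batch.
        if self._index >= self._acc_steps:
            raise StopIteration
        micro_batch_data = self._load_micro_batch(self._index)
        self._index += 1
        # Second check: true while returning the last micro-batch.
        # Releasing the data here is an assumed purpose, not the PR's code.
        if self._index >= self._acc_steps:
            self._data = None
        return micro_batch_data


iterator = MicroBatchIterator([1, 2, 3, 4], acc_steps=2)
batches = list(iterator)
```

Under these assumptions the two checks are not redundant, because they trigger on different calls to `__next__`.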
ForFishes
left a comment
LGTM
PR Category
Distributed Strategy
PR Types
Improvements
Description
Pass a local function to forward_backward_pipeline instead of the dataset itself. Passing the dataset as a direct argument to forward_backward_pipeline creates additional references to it that cannot be cleared, which leads to GPU memory leaks.
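The pattern described above can be sketched in plain Python. This is a minimal illustration only, not the PR's implementation: `DatasetProvider`, `FakeDataset`, `run_pipeline`, and `train_step` are hypothetical names, and the sketch does not reproduce the GPU leak itself. It only shows the callee receiving a zero-argument callable and resolving the dataset internally, with a weak reference used to confirm that the dataset is released once the step returns.

```python
import gc
import weakref


class DatasetProvider:
    """Hypothetical wrapper around a zero-argument function returning the dataset."""

    def __init__(self, function):
        self.function = function

    def __call__(self):
        return self.function()


class FakeDataset:
    """Placeholder for a dataset that would hold large buffers."""

    def __init__(self, samples):
        self.samples = list(samples)


def run_pipeline(data):
    # Stand-in for the pipeline entry point: the dataset is resolved here,
    # so the caller never passes the dataset object as a direct argument.
    if isinstance(data, DatasetProvider):
        data = data()
    return sum(data.samples)


def train_step():
    dataset = FakeDataset(range(4))
    dataset_ref = weakref.ref(dataset)
    # Pass a local function (here a lambda) instead of the dataset itself.
    total = run_pipeline(DatasetProvider(lambda: dataset))
    return total, dataset_ref


total, dataset_ref = train_step()
gc.collect()
dataset_released = dataset_ref() is None
```

In this toy setting the wrapper and its closure are dropped when `run_pipeline` returns, so nothing outlives `train_step` and still refers to the dataset.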
pcard-76459