Contributor

@xiaoguoguo626807 xiaoguoguo626807 commented Sep 1, 2025

PR Category

Execute Infrastructure

PR Types

Improvements

Description

pcard-67164

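Distribute the keys of the full model evenly across multiple cards (devices) so that parameter merging is shared among them.

The unit test runs in the distribute_stable CI.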
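
The idea of spreading the full model's keys evenly across cards can be sketched as below. This is a minimal illustration only, not the code in `load_state_dict.py`: the helper name `assign_keys_to_ranks`, the round-robin policy, and the example key names are all assumptions made for the sketch.

```python
from typing import Dict, List, Sequence


def assign_keys_to_ranks(keys: Sequence[str], world_size: int) -> Dict[int, List[str]]:
    """Split keys across ranks so per-rank key counts differ by at most one.

    Hypothetical helper for illustration; not part of Paddle's API.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    # Sort first so every rank derives the same assignment without communicating.
    ordered = sorted(keys)
    assignment: Dict[int, List[str]] = {rank: [] for rank in range(world_size)}
    # Round-robin: the key at position i goes to rank i % world_size.
    for index, key in enumerate(ordered):
        assignment[index % world_size].append(key)
    return assignment


if __name__ == "__main__":
    # Example key names are made up for the demo.
    demo_keys = ["linear.bias", "linear.weight", "embed.weight", "norm.weight", "head.weight"]
    for rank, owned in assign_keys_to_ranks(demo_keys, world_size=2).items():
        print(rank, owned)
```

Each rank would then merge only the keys it owns. Note that this sketch balances the number of keys, not the number of bytes; a size-aware policy would need the tensor shapes as well.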


paddle-bot bot commented Sep 1, 2025

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

@codecov-commenter

Codecov Report

❌ Patch coverage is 1.78571% with 55 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@2f7ce55). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...distributed/flex_checkpoint/dcp/load_state_dict.py | 1.78% | 55 Missing ⚠️ |

❌ Your patch status has failed because the patch coverage (1.78%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop   #75005   +/-   ##
==========================================
  Coverage           ?    1.78%           
==========================================
  Files              ?        1           
  Lines              ?       56           
  Branches           ?        0           
==========================================
  Hits               ?        1           
  Misses             ?       55           
  Partials           ?        0           

@xiaoguoguo626807 xiaoguoguo626807 merged commit cb3a8e2 into PaddlePaddle:develop Sep 2, 2025
56 of 58 checks passed
@xiaoguoguo626807 xiaoguoguo626807 deleted the dist_megre branch September 2, 2025 03:00
xingmingyyj pushed a commit to xingmingyyj/Paddle that referenced this pull request Nov 5, 2025
…le#75005)

* fix data is nullptr

* add dist merge

* change test

* change test
sneaxiy pushed a commit that referenced this pull request Nov 6, 2025
….2 (#76249)

* 【FlexCP】merge_sharded_state_dict support distribute merge (#75005)

* fix data is nullptr

* add dist merge

* change test

* change test

* 【FlexCP】add Skip param param for merge_shard_state_dict (#75061)

* fix data is nullptr

* add dist merge

* change test

* change test

* add skip optimizer param

* [Flex CP]Fix merge_sharded_state_dict with aoa and offload (#75062)

* fix merge_state_dict with aoa and offload

* add tests

* refine

* fix

* fix

* add log

* fix

* fix

* 【FlexCheckpoint】Upgrade some macros and optimize load_state_dict communication (#75282)

* upgrad macros and load_state_dict comm task

fix

fix

support 0-d tensor

fix

balance save and fix

* fix test

* Add the test about the sharded_state_dict of optimizer  (#75067)

* fix the share_weight_bug

* add note

* add the unit test

* set the timeout

* add more test

* Trigger CI rebuild

* fix the CmakeLists

* handle_missing_edge_cases_in_fc (#75413)

* up_grade fc (#75613)

fix and add test

fix

fix

fix

fix cmakelists

add notion

* 【FlexCheckpoint】fix_the_layer_id_macro (#75556)

* fix_the_layer_id_macro

* fix the ctest

* add expert_id_macro

* fix the assert bug

* fix the code style

* Pr support load hf checkpoint (#75928)

* support hf checkpoint

fix

support cast

add id macro

fix

* add test and fix some bug

* fix full param bug

* add full param cast test

---------

Co-authored-by: xingmingyyj

* 【Flexcheckpoint】add_get_var_mapping_chain_macro (#76013)

* add_get_var_mapping_chain_macro

* add note

* fix the bug input_vars and resolve_mapping_chain

* fix the code style

* fit the dtype assert bug

* fix the bug

* fix the merge_sharded_state_dict bug

* fix aoa transpose corner case (#76234)

---------

Co-authored-by: xiaoguoguo626807
Co-authored-by: Chen Zhiyang
Co-authored-by: Tianyu Zheng