@liyonghua0910
Collaborator

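Fixes an issue where cache processes intermittently crash when weights are cleared and updated frequently. The root cause is that the logic synchronizing cache_ready_signal after each cache process clears its weights was written incorrectly, so cache_ready_signal did not hold the expected value at update time. This PR also adds rank information to the logs and prints the value of cache_ready_signal, to make future problems easier to track down.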
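For readers unfamiliar with this kind of signal, here is a minimal sketch of a per-rank ready flag going through one clear/update cycle. It is not the code from this PR: the names `cache_worker`, `wait_until_all`, `run_cycle` and `NUM_RANKS` are hypothetical, threads stand in for the cache processes, and a plain list stands in for the shared signal. It only assumes that each rank owns one slot, resets it after clearing, and sets it again after updating.

```python
import threading
import time

NUM_RANKS = 4  # hypothetical number of cache processes


def wait_until_all(signal, expected, timeout=5.0, poll=0.001):
    """Poll until every slot equals `expected`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if all(v == expected for v in signal):
            return True
        time.sleep(poll)
    return False


def cache_worker(rank, signal, do_clear, do_update, log):
    """Stand-in for one cache process; it only ever writes its own slot."""
    do_clear.wait()
    signal[rank] = 0  # weights cleared: this rank is no longer ready
    log.append(f"[rank {rank}] cleared, cache_ready_signal={list(signal)}")
    do_update.wait()
    signal[rank] = 1  # weights restored: this rank is ready again
    log.append(f"[rank {rank}] updated, cache_ready_signal={list(signal)}")


def run_cycle():
    """Drive one clear/update cycle and report what the coordinator saw."""
    signal = [1] * NUM_RANKS  # all ranks start out ready
    do_clear, do_update = threading.Event(), threading.Event()
    log = []
    workers = [
        threading.Thread(
            target=cache_worker, args=(r, signal, do_clear, do_update, log)
        )
        for r in range(NUM_RANKS)
    ]
    for w in workers:
        w.start()
    do_clear.set()
    # Only start the update once every rank has reset its slot.
    cleared = wait_until_all(signal, 0)
    do_update.set()
    ready = wait_until_all(signal, 1)
    for w in workers:
        w.join()
    return cleared, ready, signal, log


if __name__ == "__main__":
    cleared, ready, signal, log = run_cycle()
    print(cleared, ready, signal)
    print("\n".join(log))
```

The point of the sketch is the ordering: the update is only triggered after every slot has been observed as reset, so a stale value left over from the clear step cannot be mistaken for readiness. Including the rank and the full signal in each log line mirrors the logging change described above.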

@paddle-bot

paddle-bot bot commented Sep 23, 2025

Thanks for your contribution!

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit cb8d87b into PaddlePaddle:release/2.2 Sep 23, 2025
11 of 15 checks passed
2 participants