@bukejiyu
Collaborator

@bukejiyu bukejiyu commented Aug 6, 2025

Fix abnormal inference accuracy in qwen3 0.3B
Fix the qwq bias not being sharded correctly

@paddle-bot

paddle-bot bot commented Aug 6, 2025

Thanks for your contribution!

param = params_dict[model_param_name]
weight_loader = getattr(param, "weight_loader", default_weight_loader(self.fd_config))
weight_loader(param, loaded_weight, shard_id)
if self.tie_word_embeddings and "embed_tokens" in loaded_weight_name:
Collaborator

Couldn't you just write `if self.tie_word_embeddings: self.lm_head.linear.weight.set_value(self.model.embed_tokens.embeddings.weight.transpose([1, 0]))` right here? Why make it so complicated?

Collaborator

+1

Collaborator Author

done
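
For readers following this thread, the one-line alternative suggested above can be illustrated with a minimal sketch in plain Python. It assumes the embedding table is stored as [vocab_size, hidden_size] and the output projection expects [hidden_size, vocab_size], which would explain the `transpose([1, 0])` in the suggestion. The helper names (`transpose`, `tie_lm_head`, `logits`) and the toy values are hypothetical; they are not part of this PR or of Paddle.

```python
def transpose(matrix):
    """Swap the two axes of a 2-D list, analogous to transpose([1, 0])."""
    return [list(row) for row in zip(*matrix)]


def tie_lm_head(embed_weight):
    """Build a [hidden, vocab] output weight from a [vocab, hidden] embedding table."""
    return transpose(embed_weight)


def logits(hidden_state, lm_head_weight):
    """Compute hidden_state @ lm_head_weight for a single token."""
    return [
        sum(h * w for h, w in zip(hidden_state, column))
        for column in zip(*lm_head_weight)
    ]


# Toy embedding table (assumed layout): vocab size 3, hidden size 2.
embed_weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Tied output projection: the same values, with the axes swapped.
lm_head_weight = tie_lm_head(embed_weight)

# Each logit is the dot product of the hidden state with one token's embedding.
token_logits = logits([2.0, 3.0], lm_head_weight)
```

With tied weights, each output logit is the dot product between the final hidden state and the corresponding token's embedding row, so the output head needs no separately loaded weight.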

@bukejiyu bukejiyu changed the title from "[bugfix]qwen3_fix" to "[bugfix]qwen3_fix and qwq fix" on Aug 7, 2025
@Jiang-Jia-Jun Jiang-Jia-Jun merged commit b76b17f into PaddlePaddle:develop Aug 8, 2025
11 of 14 checks passed
4 participants