
Conversation


@bukejiyu bukejiyu commented Aug 28, 2025

Reduce EB300B model loading time: 10 min -> 3 min.


paddle-bot bot commented Aug 28, 2025

Thanks for your contribution!

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@ce9c091). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3700   +/-   ##
==========================================
  Coverage           ?   87.50%           
==========================================
  Files              ?        4           
  Lines              ?       16           
  Branches           ?        3           
==========================================
  Hits               ?       14           
  Misses             ?        0           
  Partials           ?        2           
Flag Coverage Δ
diff 87.50% <ø> (?)

Flags with carried forward coverage won't be shown.


@bukejiyu bukejiyu changed the title [v1loader]tmp fix [v1loader]Reduce EB300B model loading time Sep 1, 2025
YuanRisheng previously approved these changes Sep 1, 2025
if shard_id is None:
    # 1. gate/up fused on disk
    model_format = getattr(param, "model_format", "")
    is_opensource_weight = model_format == "torch"
Collaborator

The name is_opensource_weight is not a good choice; isn't ERNIE also a so-called "open-source weight"?

Collaborator Author

Renamed it to is_torch_model.

per_rank = output_size // 2
start = self.tp_rank * per_rank
loaded_weight_shard_gate = slice_fn(
    loaded_weight, is_opensource_weight ^ SHARD_ID_TO_SHARDED_DIM["gate"], start, start + per_rank
)
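A standalone sketch of the per-rank slicing above, with numpy standing in for slice_fn and toy shapes; the helper name slice_rank_shard is hypothetical, used only to illustrate how each tensor-parallel rank takes a contiguous chunk along the shard dimension:

```python
import numpy as np

def slice_rank_shard(weight, dim, tp_rank, tp_size):
    """Take this rank's contiguous chunk of `weight` along `dim`."""
    per_rank = weight.shape[dim] // tp_size
    start = tp_rank * per_rank
    idx = [slice(None)] * weight.ndim
    idx[dim] = slice(start, start + per_rank)
    return weight[tuple(idx)]

# Toy fused weight [in=4, out=6]; rank 1 of 2 takes columns 3..5.
w = np.arange(24).reshape(4, 6)
shard = slice_rank_shard(w, dim=1, tp_rank=1, tp_size=2)
assert shard.shape == (4, 3)
```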
Collaborator

What exactly is the XOR operation here meant to express?

Collaborator Author

The transpose was removed, so the sharding dimension has to be flipped; the XOR is there to do the flip.
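The flip can be sketched as follows. The SHARD_ID_TO_SHARDED_DIM values and the helper name resolve_shard_dim are hypothetical; the point is only that for a 2-D weight, XOR-ing a 0/1 dim with True flips 0 <-> 1:

```python
# Assumed mapping for illustration; the real table lives in the codebase.
SHARD_ID_TO_SHARDED_DIM = {"gate": 1, "up": 1, "down": 0}

def resolve_shard_dim(is_torch_model: bool, shard_id: str) -> int:
    """When the on-disk (torch) layout is the transpose of the parameter
    layout, the dimension to slice along is the opposite one, so XOR with
    True flips 0 <-> 1; XOR with False leaves the dim unchanged."""
    return int(is_torch_model) ^ SHARD_ID_TO_SHARDED_DIM[shard_id]

assert resolve_shard_dim(False, "gate") == 1  # paddle layout: shard dim 1
assert resolve_shard_dim(True, "gate") == 0   # torch layout: shard dim 0
```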

if self.tp_size > 1:
is_opensource_weight = model_format == "torch"
if self.tp_size > 1 and not is_sharded:
    weight_shard_dim = is_opensource_weight ^ shard_dim
Collaborator

Everywhere else the name shard_dim is used, so is there some special meaning in renaming it weight_shard_dim here? Would weight_loader ever load anything other than weights?

Collaborator Author
@bukejiyu bukejiyu Sep 1, 2025

Because the transpose was removed, the sharding dimension of the weight on disk is exactly the opposite of the parameter's sharding dimension. I'll rename it to tp_shard_dim, then.
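A quick numpy check of the relationship described above, with toy shapes rather than the actual weights: slicing dimension d of a 2-D weight is the same as slicing dimension 1 - d of its transpose, which is why the disk layout and the parameter layout shard on opposite dims.

```python
import numpy as np

w = np.arange(12).reshape(3, 4)  # toy 2-D weight

# Columns 1..2 of w == rows 1..2 of w.T (transposed back for comparison).
assert (w[:, 1:3] == w.T[1:3, :].T).all()
```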

@bukejiyu bukejiyu merged commit b6a4115 into PaddlePaddle:develop Sep 2, 2025
15 of 17 checks passed
bukejiyu added a commit to bukejiyu/FastDeploy that referenced this pull request Sep 2, 2025
Jiang-Jia-Jun pushed a commit that referenced this pull request Sep 3, 2025
bukejiyu added a commit to bukejiyu/FastDeploy that referenced this pull request Sep 3, 2025

Labels: None yet
Projects: None yet
4 participants