
Conversation


@bukejiyu bukejiyu commented Aug 28, 2025

Reduce EB300B model loading time: 10 min -> 3 min.


paddle-bot bot commented Aug 28, 2025

Thanks for your contribution!

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@ce9c091). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3700   +/-   ##
==========================================
  Coverage           ?   87.50%           
==========================================
  Files              ?        4           
  Lines              ?       16           
  Branches           ?        3           
==========================================
  Hits               ?       14           
  Misses             ?        0           
  Partials           ?        2           
Flag Coverage Δ
diff 87.50% <ø> (?)

Flags with carried forward coverage won't be shown.


@bukejiyu bukejiyu changed the title [v1loader]tmp fix [v1loader]Reduce EB300B model loading time Sep 1, 2025
YuanRisheng previously approved these changes Sep 1, 2025
if shard_id is None:
    # 1. gate/up fused on disk
    model_format = getattr(param, "model_format", "")
    is_opensource_weight = model_format == "torch"
Collaborator

The name is_opensource_weight is not a good choice; isn't ERNIE also a so-called "open-source weight"?

Collaborator Author

Renamed it to is_torch_model.

per_rank = output_size // 2
start = self.tp_rank * per_rank
loaded_weight_shard_gate = slice_fn(
    loaded_weight, is_opensource_weight ^ SHARD_ID_TO_SHARDED_DIM["gate"], start, start + per_rank
)
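A standalone sketch of the per-rank slicing above, with numpy standing in for slice_fn and toy shapes; the helper name slice_rank_shard is hypothetical, used only to illustrate how each tensor-parallel rank takes a contiguous chunk along the shard dimension:

```python
import numpy as np

def slice_rank_shard(weight, dim, tp_rank, tp_size):
    """Take this rank's contiguous chunk of `weight` along `dim`."""
    per_rank = weight.shape[dim] // tp_size
    start = tp_rank * per_rank
    idx = [slice(None)] * weight.ndim
    idx[dim] = slice(start, start + per_rank)
    return weight[tuple(idx)]

# Toy fused weight [in=4, out=6]; rank 1 of 2 takes columns 3..5.
w = np.arange(24).reshape(4, 6)
shard = slice_rank_shard(w, dim=1, tp_rank=1, tp_size=2)
assert shard.shape == (4, 3)
```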
Collaborator

What exactly is the XOR operation here meant to express?

Collaborator Author

The transpose was removed, so the sharding dimension has to be flipped; the XOR is there to do the flip.
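The flip can be sketched as follows. The SHARD_ID_TO_SHARDED_DIM values and the helper name resolve_shard_dim are hypothetical; the point is only that for a 2-D weight, XOR-ing a 0/1 dim with True flips 0 <-> 1:

```python
# Assumed mapping for illustration; the real table lives in the codebase.
SHARD_ID_TO_SHARDED_DIM = {"gate": 1, "up": 1, "down": 0}

def resolve_shard_dim(is_torch_model: bool, shard_id: str) -> int:
    """When the on-disk (torch) layout is the transpose of the parameter
    layout, the dimension to slice along is the opposite one, so XOR with
    True flips 0 <-> 1; XOR with False leaves the dim unchanged."""
    return int(is_torch_model) ^ SHARD_ID_TO_SHARDED_DIM[shard_id]

assert resolve_shard_dim(False, "gate") == 1  # paddle layout: shard dim 1
assert resolve_shard_dim(True, "gate") == 0   # torch layout: shard dim 0
```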

if self.tp_size > 1:
is_opensource_weight = model_format == "torch"
if self.tp_size > 1 and not is_sharded:
    weight_shard_dim = is_opensource_weight ^ shard_dim
Collaborator

Everywhere else the name shard_dim is used, so is there some special meaning in renaming it weight_shard_dim here? Would weight_loader ever load anything other than weights?

Collaborator Author
@bukejiyu bukejiyu Sep 1, 2025

Because the transpose was removed, the sharding dimension of the weight on disk is exactly the opposite of the parameter's sharding dimension. I'll rename it to tp_shard_dim, then.
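A quick numpy check of the relationship described above, with toy shapes rather than the actual weights: slicing dimension d of a 2-D weight is the same as slicing dimension 1 - d of its transpose, which is why the disk layout and the parameter layout shard on opposite dims.

```python
import numpy as np

w = np.arange(12).reshape(3, 4)  # toy 2-D weight

# Columns 1..2 of w == rows 1..2 of w.T (transposed back for comparison).
assert (w[:, 1:3] == w.T[1:3, :].T).all()
```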

@bukejiyu bukejiyu merged commit b6a4115 into PaddlePaddle:develop Sep 2, 2025
15 of 17 checks passed
bukejiyu added a commit to bukejiyu/FastDeploy that referenced this pull request Sep 2, 2025
Jiang-Jia-Jun pushed a commit that referenced this pull request Sep 3, 2025
bukejiyu added a commit to bukejiyu/FastDeploy that referenced this pull request Sep 3, 2025

Labels: None yet
Projects: None yet
4 participants