[v1loader]Reduce EB300B model loading time #3700
Conversation
Thanks for your contribution!
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##            develop    #3700   +/-  ##
==========================================
  Coverage          ?   87.50%
==========================================
  Files             ?        4
  Lines             ?       16
  Branches          ?        3
==========================================
  Hits              ?       14
  Misses            ?        0
  Partials          ?        2

View full report in Codecov by Sentry.
Force-pushed from 29270b3 to 497034f.
if shard_id is None:
    # 1. gate and up weights are fused on disk
    model_format = getattr(param, "model_format", "")
    is_opensource_weight = model_format == "torch"
The name `is_opensource_weight` is not a good choice; isn't ERNIE also an "open-source weight"?
Renamed it to `is_torch_model`.
per_rank = output_size // 2
start = self.tp_rank * per_rank
loaded_weight_shard_gate = slice_fn(
    loaded_weight, is_opensource_weight ^ SHARD_ID_TO_SHARDED_DIM["gate"], start, start + per_rank
)
What exactly is the XOR operation here meant to express?
The transpose was removed, so the dimension to shard along has to be flipped; the XOR is there to do the flip.
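A minimal sketch of what the XOR achieves, assuming `SHARD_ID_TO_SHARDED_DIM` maps each shard id to 0 or 1 and that the format flag is a plain Python bool; `slice_fn` below is a stand-in helper, not the project's implementation:

```python
import numpy as np

SHARD_ID_TO_SHARDED_DIM = {"gate": 0, "up": 0}  # assumed mapping


def slice_fn(weight, dim, start, end):
    """Stand-in slicer: take [start:end) along the given dimension."""
    index = [slice(None)] * weight.ndim
    index[dim] = slice(start, end)
    return weight[tuple(index)]


loaded_weight = np.arange(24).reshape(6, 4)

for is_torch_model in (False, True):
    # XOR with a 0/1 shard dim flips the dimension only when the checkpoint
    # is in torch format, i.e. stored without the transpose that was removed.
    dim = is_torch_model ^ SHARD_ID_TO_SHARDED_DIM["gate"]
    print(is_torch_model, dim, slice_fn(loaded_weight, dim, 0, 2).shape)
    # False -> dim 0, shape (2, 4); True -> dim 1, shape (6, 2)
```

Since a bool is a 0/1 integer in Python, `flag ^ dim` leaves the dimension unchanged when the flag is False and flips 0 <-> 1 when it is True.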
-if self.tp_size > 1:
+is_opensource_weight = model_format == "torch"
+if self.tp_size > 1 and not is_sharded:
+    weight_shard_dim = is_opensource_weight ^ shard_dim
Everywhere else the name is still `shard_dim`; is there a special meaning to renaming it `weight_shard_dim` here? What would `weight_loader` be loading if not a weight?
Because the transpose was removed, the shard dimension of the weight on disk is exactly the opposite of the parameter's shard dimension. I'll rename it to `tp_shard_dim`.
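To make the agreed rename concrete, here is a hedged sketch of the tensor-parallel slicing step using the `tp_shard_dim` name; the helper, shapes, and argument values are invented for illustration and do not mirror the real loader:

```python
import numpy as np


def tp_shard(loaded_weight, shard_dim, is_torch_model, tp_size, tp_rank):
    """Hypothetical helper: return this rank's slice of the loaded weight."""
    if tp_size > 1:
        # Torch-format checkpoints keep the weight un-transposed on disk, so
        # the dimension to split is the opposite of the parameter's shard_dim.
        tp_shard_dim = is_torch_model ^ shard_dim
        block = loaded_weight.shape[tp_shard_dim] // tp_size
        index = [slice(None)] * loaded_weight.ndim
        index[tp_shard_dim] = slice(tp_rank * block, (tp_rank + 1) * block)
        loaded_weight = loaded_weight[tuple(index)]
    return loaded_weight


w = np.zeros((8, 4))
print(tp_shard(w, shard_dim=0, is_torch_model=True, tp_size=2, tp_rank=0).shape)   # (8, 2)
print(tp_shard(w, shard_dim=0, is_torch_model=False, tp_size=2, tp_rank=0).shape)  # (4, 4)
```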
* speed up eb45
* update
Reduce EB300B model loading time: 10 min -> 3 min.