Closed
Labels
- ep:WebGPU (ort-web webgpu provider)
- model:transformer (issues related to a transformer model: BERT, GPT2, Hugging Face, Longformer, T5, etc.)
- platform:web (issues related to ONNX Runtime web; typically submitted using template)
Description
This issue is for tracking WebGPU operators. It includes the following info:
- a list of supported operators
- a list of WIP operator implementations
- info about problems/correctness/performance specific to a certain operator.
Supported operators
Operators currently in progress
ops needed for segment anything:
| OpType | Assigned To | Comments | PR |
|---|---|---|---|
| ArgMax.float | @guschmue | #16882 | |
| Cast.bool | @fs-eire | | |
| Equal.float | @fs-eire | | |
| Einsum.float | @sajandhy | | |
| Gather.float | @dakenf | #16855 | |
| LayerNormalization.float | @dakenf | #16830 | |
| Not.bool | @jchen351 | #16891 | |
| Softmax.float | @guschmue | #16882 | |
assuming above ops are implemented, ops missing for segment anything encoder:
(an offline script would replace int64, which is not supported in WebGPU, with int32)
| OpType | Assigned To | Comments | PR |
|---|---|---|---|
| Concat.int32 | | | |
| Einsum.float | | | |
| Gather.int32 | | | |
| Pad.float | | | |
| Slice.int32 | | | |
| Sub.int32 | | | |
| Transpose.int32 | | | |
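
The offline script mentioned in the note above is not shown in this issue. As a rough sketch of the value-narrowing step only, assuming every int64 value must either fit into int32 or be clamped (exporters often use INT64_MAX as a "slice to the end" sentinel), it could look like the following. The function name `narrow_int64_to_int32` and the `clamp` flag are hypothetical and not part of any onnxruntime tooling.

```python
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


def narrow_int64_to_int32(values, clamp=False):
    """Return the values as int32-safe Python ints.

    Hypothetical helper: raises OverflowError for values outside the
    int32 range unless clamp=True, in which case they are saturated.
    """
    narrowed = []
    for v in values:
        if v < INT32_MIN or v > INT32_MAX:
            if not clamp:
                raise OverflowError(f"{v} does not fit into int32")
            # Saturate sentinels such as INT64_MAX to the int32 limits.
            v = max(INT32_MIN, min(INT32_MAX, v))
        narrowed.append(v)
    return narrowed
```

Raising by default keeps silent data corruption out of the converted model; clamping is an opt-in for tensors that are known to hold sentinel values only.
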
assuming above ops are implemented, ops missing for t5 encoder:
(an offline script would replace int64, which is not supported in WebGPU, with int32)
| OpType | Assigned To | Comments | PR |
|---|---|---|---|
| Abs.int32 | | | |
| Add.int32 | | | |
| ConstantOfShape.int32 | | | |
| Greater.int32 | | | |
| Less.int32 | | | |
| Log.float | | | |
| Min.int32 | | | |
| Mul.int32 | | | |
| Range.int32 | | | |
| Reshape.int32 | | | |
| Shape.int32 | | | |
| Sub.int32 | | | |
| Where.bool | | | |
assuming above ops are implemented, ops missing for t5 decoder:
(an offline script would replace int64, which is not supported in WebGPU, with int32)
| OpType | Assigned To | Comments | PR |
|---|---|---|---|
| If.bool | | | |
| LessOrEqual.int32 | | | |
| Tile.int32 | | | |
assuming above ops are implemented, ops missing for dolly-v2-3b:
(an offline script would replace int64, which is not supported in WebGPU, with int32; this assumes fp32 for now, which is not going to work)
| OpType | Assigned To | Comments | PR |
|---|---|---|---|
| Div.int32 | | | |
| GatherElements.float | | | |
| Mul.int32 | | | |
| Neg.int32 | | | |
| Slice.bool | | | |
| Slice.int32 | | | |
| Tile.float | | | |
| Transpose.float | | | |
Failing operators
Operators that need to be optimized
| OpType | Assigned To | Comments |
|---|---|---|
| FusedConv | @guschmue | need to fuse conv and activation |
| Conv | TBD | optimize the one-time filter transpose at init |
| FusedMatmul | TBD | |
| FusedGemm | TBD | |
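
To illustrate what "fuse conv and activation" in the table above refers to, here is a minimal pure-Python sketch. It is not the WebGPU shader code. The 1-D convolution, the choice of ReLU as the activation, and all function names are assumptions made for illustration. The unfused path writes an intermediate result and then walks it a second time; the fused path applies the activation while each output element is still in the accumulator.

```python
def conv1d(x, w, bias=0.0):
    """Valid 1-D cross-correlation of signal x with kernel w."""
    n = len(x) - len(w) + 1
    return [sum(x[i + k] * w[k] for k in range(len(w))) + bias for i in range(n)]


def relu(values):
    """Separate activation pass over an already computed buffer."""
    return [max(0.0, v) for v in values]


def fused_conv1d_relu(x, w, bias=0.0):
    """Single pass: accumulate one output element, then clamp it at once."""
    n = len(x) - len(w) + 1
    out = []
    for i in range(n):
        acc = bias
        for k in range(len(w)):
            acc += x[i + k] * w[k]
        out.append(max(0.0, acc))  # activation applied before the write
    return out
```

On a GPU the same idea would typically save one kernel dispatch and one round trip of the intermediate tensor through memory, which is presumably the motivation for the FusedConv entry.
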