Skip to content

Comments

[js/webgpu] add kernel Not and Equal#17306

Merged
guschmue merged 2 commits intomainfrom
fs-eire/webgpu-not-equal
Aug 28, 2023
Merged

[js/webgpu] add kernel Not and Equal#17306
guschmue merged 2 commits intomainfrom
fs-eire/webgpu-not-equal

Conversation

@fs-eire
Copy link
Contributor

@fs-eire fs-eire commented Aug 26, 2023

Description

This PR adds kernel implementation for operator "Not" and "Equal". Also removed download cache in gpu data manager.

Why removing download cache
The following test case failed. ("Or" is on CPU, "Greater" and "Equal" are on JSEP)
image
after debugging, I found that both "Equal" and "Greater" are using the same output GPU Data ID. This is because when ORT executes the graph, it first run "Equal", allowing its shader to write into GPU Data ID 2; then a Gpu2Cpu copy for it is issued (because currently "Or" is on CPU EP); at this point, ORT thinks GPU Data ID=2 is free to use; so it reuse it as output for "Greater". This means there is no allocation for output of "Greater" kernel, and both kernel writes to GPU Data ID=2.

For gpu data manager, there will be 2 downloads from the same GPU buffer. Previously I think this is a waste of resource so I cached the data. But now it shoes that we need to perform 2 downloads because the GPU data is already different. The download data cache should be removed.

Motivation and Context

@guschmue guschmue merged commit bb18713 into main Aug 28, 2023
@guschmue guschmue deleted the fs-eire/webgpu-not-equal branch August 28, 2023 02:50
@guschmue
Copy link
Contributor

tested with segment-anything decoder

@guschmue guschmue added the ep:WebGPU ort-web webgpu provider label Aug 28, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request Mar 22, 2024
### Description
This PR adds kernel implementation for operator "Not" and "Equal". Also
removed download cache in gpu data manager.

**Why removing download cache**
The following test case failed. ("Or" is on CPU, "Greater" and "Equal"
are on JSEP)

![image](https://github.com/microsoft/onnxruntime/assets/7679871/8d9798ad-2703-4fb9-907e-ff716c67d0b2)
after debugging, I found that both "Equal" and "Greater" are using the
same output GPU Data ID. This is because when ORT executes the graph, it
first run "Equal", allowing its shader to write into GPU Data ID 2; then
a Gpu2Cpu copy for it is issued (because currently "Or" is on CPU EP);
at this point, ORT thinks GPU Data ID=2 is free to use; so it reuse it
as output for "Greater". This means there is no allocation for output of
"Greater" kernel, and both kernel writes to GPU Data ID=2.

For gpu data manager, there will be 2 downloads from the same GPU
buffer. Previously I think this is a waste of resource so I cached the
data. But now it shoes that we need to perform 2 downloads because the
GPU data is already different. The download data cache should be
removed.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
siweic0 pushed a commit to siweic0/onnxruntime-web that referenced this pull request May 9, 2024
### Description
This PR adds kernel implementation for operator "Not" and "Equal". Also
removed download cache in gpu data manager.

**Why removing download cache**
The following test case failed. ("Or" is on CPU, "Greater" and "Equal"
are on JSEP)

![image](https://github.com/microsoft/onnxruntime/assets/7679871/8d9798ad-2703-4fb9-907e-ff716c67d0b2)
after debugging, I found that both "Equal" and "Greater" are using the
same output GPU Data ID. This is because when ORT executes the graph, it
first run "Equal", allowing its shader to write into GPU Data ID 2; then
a Gpu2Cpu copy for it is issued (because currently "Or" is on CPU EP);
at this point, ORT thinks GPU Data ID=2 is free to use; so it reuse it
as output for "Greater". This means there is no allocation for output of
"Greater" kernel, and both kernel writes to GPU Data ID=2.

For gpu data manager, there will be 2 downloads from the same GPU
buffer. Previously I think this is a waste of resource so I cached the
data. But now it shoes that we need to perform 2 downloads because the
GPU data is already different. The download data cache should be
removed.


### Motivation and Context
<!-- - Why is this change required? What problem does it solve?
- If it fixes an open issue, please link to the issue here. -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ep:WebGPU ort-web webgpu provider

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants