Allow creating mappable buffer with more usages as optional features #5108
Jiawei-Shao wants to merge 6 commits into gpuweb:main
spec/index.bs
Outdated
Changes to the returned {{ArrayBuffer}} will be stored in the buffer after
{{GPUBuffer/unmap()}} is called.

Note: Write-only mapping will never return values produced by the GPU.
Why is this restriction necessary? Won't it result in an extra copy of the data for write only mappings on UMA systems?
I think it would be preferable to state that reading the contents of a write-only mapping is undefined (or zeroed?) in this case.
Won't it result in an extra copy of the data for write only mappings on UMA systems?
Mainly because in Chromium a copy is currently always needed: we cannot access the mapped pointer of a GPU resource directly from the JS side. We can discuss this issue further in the WG meeting.
We have the same restriction in WebKit too, perhaps this is a non-issue.
I think it would be preferable to state that reading the contents of a write-only mapping is undefined (or zeroed?) in this case.
We couldn't make it fully undefined; at most we could add a third option for what gets returned. Hard to say if the performance would be any better if we allowed it. It would require cache flushes or something to make sure you don't get unsafe undefined values.
And of course for portability reasons we really don't want anyone to rely on data getting read back on write-only mappings. But that's probably not a huge concern.
GPU Web WG 2025-03-19 Atlantic-time
spec/index.bs
Outdated
{{ArrayBuffer}} containing the buffer's current values. Changes to the returned
{{ArrayBuffer}} will be stored in the {{GPUBuffer}} after {{GPUBuffer/unmap()}} is called.
{{ArrayBuffer}} containing the default initialized data (zeros) or data written by the
webpage during a previous mapping.
Just to be clear, this is allowing either of two behaviors?
For better portability we could clear the map region every time. I wonder, would that be too expensive? (How much of the benefit would we lose from avoiding the copyB2B in today's map-write-then-copy pattern?) The region could even be cleared before mapping by the GPU (which might have higher memory bandwidth than the CPU).
Maybe if there are really cases where effectively READ|WRITE would be more efficient than WRITE we could have the browser provide a hint about which one to use.
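For reference, the map-write-then-copy pattern mentioned above could look roughly like the sketch below. It only uses standard WebGPU calls (`createBuffer`, `getMappedRange`, `unmap`, `copyBufferToBuffer`, `queue.submit`). The helper name `uploadViaStaging`, the choice of a uniform buffer as the target, and the local `USAGE` table (which mirrors the numeric values of `GPUBufferUsage` so the snippet also runs outside a browser) are assumptions made for illustration.

```javascript
// Numeric values of the WebGPU usage flags used here; in a page these are
// GPUBufferUsage.MAP_WRITE, GPUBufferUsage.COPY_SRC, and so on.
const USAGE = { MAP_WRITE: 0x0002, COPY_SRC: 0x0004, COPY_DST: 0x0008, UNIFORM: 0x0040 };

// Hypothetical helper: upload `bytes` (a Uint8Array whose length is a multiple
// of 4) into a new uniform buffer through a mappable staging buffer.
function uploadViaStaging(device, bytes) {
  const staging = device.createBuffer({
    size: bytes.byteLength,
    usage: USAGE.MAP_WRITE | USAGE.COPY_SRC,
    mappedAtCreation: true, // mapped for writing right away, no mapAsync needed
  });
  new Uint8Array(staging.getMappedRange()).set(bytes); // CPU-side write
  staging.unmap(); // the written data now belongs to the staging buffer

  const target = device.createBuffer({
    size: bytes.byteLength,
    usage: USAGE.UNIFORM | USAGE.COPY_DST,
  });

  // This buffer-to-buffer copy is the step that direct mapping would avoid.
  const encoder = device.createCommandEncoder();
  encoder.copyBufferToBuffer(staging, 0, target, 0, bytes.byteLength);
  device.queue.submit([encoder.finish()]);
  return target;
}
```

With a real `GPUDevice`, `uploadViaStaging(device, new Uint8Array(16))` returns a buffer usable as a uniform binding; the staging buffer exists only to carry the data across.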
Just to be clear, this is allowing either of two behaviors?
This comes from a note in the current spec. What I mean is that the first time buffer.getMappedRange() is called after a write-only mapping, the returned array buffer should contain all zeros; from the second time on, it will contain the data written by the webpage during a previous mapping.
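The behavior described here can be illustrated with a toy model. This is not how any browser implements mapping; the class `WriteOnlyMappedBufferModel` and its methods are hypothetical stand-ins (`mapForWrite` for `mapAsync` plus `getMappedRange()`, `gpuWrite` for a GPU-side write), and it assumes the page only ever sees a CPU-side shadow copy that is never refreshed from the GPU.

```javascript
// Toy model, not a browser implementation: `gpuData` stands in for the GPU
// allocation and `shadow` for the CPU-side memory handed to the page.
class WriteOnlyMappedBufferModel {
  constructor(size) {
    this.gpuData = new Uint8Array(size);
    this.shadow = new Uint8Array(size); // zero-initialized
    this.mapped = false;
  }
  // Stand-in for a write-only map: nothing is read back from the GPU.
  mapForWrite() {
    this.mapped = true;
    return this.shadow;
  }
  // Stand-in for unmap(): the page's writes become visible to the GPU.
  unmap() {
    this.gpuData.set(this.shadow);
    this.mapped = false;
  }
  // Stand-in for a GPU-side write (e.g. from a compute pass) while unmapped.
  gpuWrite(offset, value) {
    if (this.mapped) throw new Error("buffer is mapped");
    this.gpuData[offset] = value;
  }
}

const buf = new WriteOnlyMappedBufferModel(4);
const first = Array.from(buf.mapForWrite()); // [0, 0, 0, 0]: default zeros
buf.mapForWrite()[0] = 7; // the page writes one byte
buf.unmap();
buf.gpuWrite(1, 9); // the GPU changes the buffer afterwards
const second = Array.from(buf.mapForWrite()); // [7, 0, 0, 0]: page data, not the GPU's 9
```

The second mapping shows the page's earlier write but not the value produced by the GPU, which matches the note that write-only mapping never returns GPU-produced values.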
For better portability we could clear the map region every time. I wonder, would that be too expensive?
Clearing the map region every time is clearly expensive and unnecessary. With the behavior described above, we need neither to clear the data nor to read it back from the GPU when non-triply mapping is used.
Maybe if there are really cases where effectively READ|WRITE would be more efficient than WRITE we could have the browser provide a hint about which one to use.
When triply mapping is supported on CPU-cached UMA (e.g. on Intel iGPUs), we can get the GPU data directly through buffer.getMappedRange() without any other operations. So I added the feature "buffer-map-write-with-extended-usages-and-gpu-data" for the best data-upload performance on this architecture.
The feature "buffer-map-write-with-extended-usages" also works on CPU-cached UMA with non-triply mapping, and it gives the best data-upload performance with non-triply mapping on non-CPU-cached UMA and ReBAR, where sequential writes or memcpy are much more performant than random writes.
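From the application side, choosing between the two proposed features and today's staging path might look like the sketch below. The feature names are taken from this conversation and are proposals, not shipped WebGPU features. The helper `planUpload`, its return shape, and the assumption that either feature lets `MAP_WRITE` be combined with the given GPU usage are illustrative only; the conversation does not list which usages are actually allowed.

```javascript
// Feature names as written in this pull request (proposed, not shipped).
const EXTENDED = "buffer-map-write-with-extended-usages";
const EXTENDED_GPU_DATA = "buffer-map-write-with-extended-usages-and-gpu-data";

// Numeric values of the matching GPUBufferUsage flags.
const FLAGS = { MAP_WRITE: 0x0002, COPY_SRC: 0x0004, COPY_DST: 0x0008 };

// Hypothetical helper. `features` is anything with has(), e.g. device.features.
// `gpuUsage` is the usage the GPU side needs (vertex, uniform, ...).
function planUpload(features, gpuUsage) {
  if (features.has(EXTENDED_GPU_DATA)) {
    // Assumed: one buffer, mapped for writing, showing current GPU data.
    return { path: "direct-with-gpu-data", usages: [FLAGS.MAP_WRITE | gpuUsage] };
  }
  if (features.has(EXTENDED)) {
    // Assumed: one buffer, mapped for writing, best written sequentially.
    return { path: "direct-write-only", usages: [FLAGS.MAP_WRITE | gpuUsage] };
  }
  // Fallback available today: staging buffer plus a buffer-to-buffer copy.
  return {
    path: "staging-copy",
    usages: [FLAGS.MAP_WRITE | FLAGS.COPY_SRC, gpuUsage | FLAGS.COPY_DST],
  };
}
```

For example, `planUpload(device.features, GPUBufferUsage.VERTEX)` would describe a single mappable vertex buffer when either feature is enabled, and a staging/target pair otherwise.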
Maybe if there are really cases where effectively READ|WRITE would be more efficient than WRITE we could have the browser provide a hint about which one to use.
It feels strange to me to use READ together with WRITE, because MAP_READ should leave the data on the GPU unchanged. For this use case I decided to use "MAP_WRITE with current GPU data in the array buffer" instead.
mwyrzykowski left a comment
Changes look good, maybe name could change but fine with any reasonable name.
GPU Web WG 2025-03-25/26 Pacific-time
Detailed investigation can be found here.
Fixed: #2388