This issue proposes a new opaque device-specific storage type in WebNN, MLBuffer. MLBuffer is a backend-agnostic storage type (CPU, GPU, NPU, etc.) which can be used in WebNN operations.
MLBuffer would:
- Give WebNN developers control of device storage to avoid round trips to/from the CPU.
- Be extensible with export/import to support WebNN interop with other web APIs.
Construction/Destruction
typedef [EnforceRange] unsigned long long MLSize64;
dictionary MLBufferDescriptor {
required MLSize64 size;
};
[Exposed=(Window, DedicatedWorker), SecureContext]
interface MLContext {
MLBuffer createBuffer(MLBufferDescriptor descriptor);
};
- Layout of MLBuffer is always known (and linear access is assumed).
typedef unsigned long long MLSize64Out;
[Exposed=(Window, DedicatedWorker)]
interface MLBuffer {
[CallWith=Isolate] undefined destroy();
readonly attribute MLSize64Out size;
};
- WebNN developers should prefer calling destroy() over relying on GC, for predictable device memory usage.
- destroy() gets called on the context timeline but doesn't actually release the buffer until the device signals completion.
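The deferred-release behavior described above can be sketched with a small mock. This is not the WebNN API: `MockDeviceQueue`, `MockBuffer`, `flush()`, and the `released` flag are hypothetical names used only to illustrate how destroy() can take effect immediately on the context timeline while the memory is released only after earlier device work has finished.

```javascript
// Hypothetical mock, not the WebNN API: models destroy() being recorded on the
// context timeline while the actual release waits for pending device work.
class MockDeviceQueue {
  constructor() {
    this.pending = []; // device-timeline work, run in FIFO order
  }
  enqueue(task) {
    this.pending.push(task);
  }
  // Stand-in for the device signalling completion of everything enqueued so far.
  flush() {
    while (this.pending.length > 0) {
      this.pending.shift()();
    }
  }
}

class MockBuffer {
  constructor(queue, size) {
    this.queue = queue;
    this.size = size;
    this.storage = new Uint8Array(size); // stand-in for device memory
    this.destroyed = false; // context-timeline state
    this.released = false;  // device-timeline state
  }
  destroy() {
    if (this.destroyed) return; // destroying twice is a no-op in this mock
    this.destroyed = true;
    // The release is queued behind any work already enqueued on the device.
    this.queue.enqueue(() => {
      this.storage = null;
      this.released = true;
    });
  }
}

const queue = new MockDeviceQueue();
const buffer = new MockBuffer(queue, 16);
let sawLiveStorage = false;
// Earlier device work that still needs the buffer's storage.
queue.enqueue(() => { sawLiveStorage = buffer.storage !== null; });
buffer.destroy();
const releasedBeforeFlush = buffer.released; // device has not signalled yet
queue.flush();
const releasedAfterFlush = buffer.released;  // release ran after the earlier work
```

In this mock, work enqueued before destroy() still sees valid storage, which is the property that makes an explicit destroy() safe to call right after the last use.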
Upload/Download tensor data
[Exposed=(Window, DedicatedWorker), SecureContext]
interface MLContext {
undefined writeBuffer(
MLBuffer dstBuffer,
MLSize64 dstOffset,
AllowSharedBufferSource srcData,
optional MLSize64 srcOffset = 0,
optional MLSize64 srcSize);
[Exposed=(Window)]
Promise<ArrayBuffer> readBuffer(
MLBuffer srcBuffer,
MLSize64 srcOffset,
MLSize64 srcSize);
[Exposed=(DedicatedWorker)]
undefined readBufferSync(
MLBuffer srcBuffer,
MLSize64 srcOffset,
MLSize64 srcSize,
AllowSharedBufferSource dstData);
};
- Transfer operations will execute on the device timeline in the same order they were enqueued on the context timeline.
- A copy of srcData is always made, and control returns to the web developer immediately.
- For synchronous compute, use the read-back function that matches the context: async readBuffer() in window, sync readBufferSync() in workers.
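The copy and ordering rules above can be illustrated with a mock. This is a sketch, not the WebNN API: `MockContext`, `asBytes()`, and `flush()` are hypothetical, and offsets and sizes are assumed to be in bytes. The mock snapshots srcData inside writeBuffer() and applies transfers in the order they were enqueued.

```javascript
// Hypothetical mock, not the WebNN API: models writeBuffer() snapshotting srcData
// on the context timeline and the device applying transfers in enqueue order.
function asBytes(data) {
  // Accepts a typed array / DataView or a raw (Shared)ArrayBuffer.
  return ArrayBuffer.isView(data)
    ? new Uint8Array(data.buffer, data.byteOffset, data.byteLength)
    : new Uint8Array(data);
}

class MockContext {
  constructor() {
    this.deviceQueue = []; // transfers waiting to run on the device timeline
  }
  createBuffer({ size }) {
    return { size, storage: new Uint8Array(size) };
  }
  writeBuffer(dstBuffer, dstOffset, srcData, srcOffset = 0, srcSize) {
    // Offsets and sizes are treated as bytes here (an assumption of this mock).
    const bytes = asBytes(srcData);
    const end = srcSize === undefined ? bytes.length : srcOffset + srcSize;
    if (srcOffset > end || end > bytes.length ||
        dstOffset + (end - srcOffset) > dstBuffer.size) {
      throw new RangeError('write out of bounds');
    }
    const snapshot = bytes.slice(srcOffset, end); // copy made before returning
    this.deviceQueue.push(() => dstBuffer.storage.set(snapshot, dstOffset));
  }
  // Stand-in for the device draining its queue in FIFO order.
  flush() {
    for (const op of this.deviceQueue.splice(0)) op();
  }
  readBufferSync(srcBuffer, srcOffset, srcSize, dstData) {
    this.flush(); // a read observes every transfer enqueued before it
    asBytes(dstData).set(srcBuffer.storage.subarray(srcOffset, srcOffset + srcSize));
  }
}

const ctx = new MockContext();
const buf = ctx.createBuffer({ size: 4 });
const src = new Uint8Array([1, 2, 3, 4]);
ctx.writeBuffer(buf, 0, src);
src.fill(9); // mutating the source after the call is not observed
ctx.writeBuffer(buf, 2, new Uint8Array([7, 8])); // later write lands on top
const out = new Uint8Array(4);
ctx.readBufferSync(buf, 0, 4, out);
```

Because the snapshot is taken inside writeBuffer(), the caller can reuse or overwrite its source array as soon as the call returns.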
Binding to graphs
dictionary MLBufferView {
required MLBuffer buffer;
MLSize64 offset = 0;
MLSize64 size;
};
typedef record<DOMString, MLBufferView> MLNamedMLBufferViews;
undefined dispatch(
MLGraph graph, MLNamedMLBufferViews inputs, MLNamedMLBufferViews outputs);
- Buffer usage is always assumed on first access (e.g. a buffer passed as outputs assumes output usage).
- WebNN developers must call readBuffer() to get the contents of an output MLBuffer back after dispatch().
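As an illustration of how an MLBufferView could select a sub-range of a buffer, here is a minimal sketch. `resolveView()` is a hypothetical helper, not part of the proposal, and it assumes that an omitted size means "from offset to the end of the buffer", which the issue does not specify.

```javascript
// Hypothetical helper, not part of the proposal: resolves an MLBufferView-like
// dictionary to a concrete byte range. Assumes an omitted size means
// "from offset to the end of the buffer", which the issue does not specify.
function resolveView(view) {
  const offset = view.offset ?? 0;
  const size = view.size ?? view.buffer.size - offset;
  if (offset < 0 || size < 0 || offset + size > view.buffer.size) {
    throw new RangeError('view exceeds buffer bounds');
  }
  return { buffer: view.buffer, offset, size };
}

// Two graph inputs packed into one 32-byte buffer (names are illustrative).
const packed = { size: 32 }; // stand-in for an MLBuffer
const named = {
  A: resolveView({ buffer: packed, size: 16 }),
  B: resolveView({ buffer: packed, offset: 16 }),
};
```

Packing several tensors into one buffer this way would let a caller allocate device storage once and bind different ranges per dispatch.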
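// Upload input data to a device buffer, dispatch, then read the output back.
const dataA = new Float32Array(4).fill(1.0);
const bufferA = context.createBuffer({size: dataA.byteLength});
const bufferB = context.createBuffer({size: dataA.byteLength});
context.writeBuffer(bufferA, 0, dataA);
const inputs = {'A': {buffer: bufferA}};
const outputs = {'B': {buffer: bufferB}};
context.dispatch(graph, inputs, outputs);
const result = await context.readBuffer(bufferB, 0, bufferB.size);
Edits: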
- 12/15: added MLBuffer dispatch instead of overloading compute() per https://www.w3.org/2023/12/14-webmachinelearning-minutes.html
- 12/15: fixed createBuffer return; it should have been non-optional.
- 1/10: edit to rename MLNamedMLBufferViews => MLNamedMLBufferResourceViews
- 1/10: added readBufferSync
- 1/17: renamed MLBufferResource => MLBuffer