Description
Context
This arose during prototyping additional models and during Chromium code review and ORT review.
Overview
[1,2,3] - 3D tensor shape
[1,2] - 2D tensor shape
[1] - 1D tensor shape
[] - 0D tensor shape (scalar)
A 0D tensor and a 1D tensor with a single element are semantically and functionally distinct. For simple elementwise operators (e.g. Add or Sin), treating a scalar as if it were a 1D array with a single element yields the same functional result. But for operators where rank matters (e.g. ReduceSum over all axes with keepDimensions = false, or Gather, where the output rank is computed as input.rank + indices.rank - 1), treating the two as synonymous is problematic and has caused a multitude of problems when mapping certain ONNX models to WebNN.
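The rank-sensitivity described above can be demonstrated in NumPy, where a 0D index and a 1D single-element index produce outputs of different rank:

```python
import numpy as np

# ReduceSum over all axes with keepdims=False: a 2D input yields a 0D result.
x = np.ones((2, 3), dtype=np.float32)
r = np.sum(x, keepdims=False)
print(r.shape)  # () -- a 0D scalar, not (1,)

# Gather: output rank = input.rank + indices.rank - 1.
data = np.array([10, 20, 30], dtype=np.float32)  # rank 1
scalar_index = np.array(1)                       # rank 0
vector_index = np.array([1])                     # rank 1
print(np.take(data, scalar_index).shape)  # () -> 1 + 0 - 1 = 0
print(np.take(data, vector_index).shape)  # (1,) -> 1 + 1 - 1 = 1
```

If 0D shapes are disallowed, the scalar-index case cannot be expressed faithfully: the backend must guess whether a [1]-shaped operand "really" means a scalar, which is exactly the ambiguity causing the ONNX mapping problems.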
References
Every known ML library represents 0D scalars via the shape [] (or ()):
Numpy
import numpy
x = numpy.array(42)
y = numpy.add(x, x)
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: 84
# shape: ()
TensorFlow
import tensorflow as tf
x = tf.constant(42, dtype=tf.float32)
y = tf.add(x, x)
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: tf.Tensor(84.0, shape=(), dtype=float32)
# shape: ()
PyTorch
import torch
x = torch.tensor(42, dtype=torch.float)
y = torch.add(x, x)
print("value:", y)
print("shape:", y.shape)
# Prints:
# value: tensor(84.)
# shape: torch.Size([])
ONNX
import onnx
# Scalar via [].
x = onnx.helper.make_tensor(
    "value", onnx.TensorProto.FLOAT, [], [42]
)
SafeTensors
The SafeTensors file format (commonly used with Stable Diffusion models for custom weights) explicitly allows 0D scalars and 0-size tensors.
XNNPack
Bin Miao says below that XNNPack works.
Implementation experience
In my Chromium fork, I only needed to change 2 lines for the rest to work: deleting 1 errant validation statement in the graph builder, and adding 1 line in the DML backend (the TensorDesc constructor sets a minimum rank of 1). Then scalar tests passed:
// 1. Create a computational graph
const inputTensorDesc = {type: 'float32', dimensions: []};
const constantDesc = { type: 'float32', dimensions: []};
const A = graphBuilder.input('A', inputTensorDesc);
const B = graphBuilder.constant(constantDesc, new Float32Array([42.0]));
const C = graphBuilder.add(A, B);
LogTensorShapes(graphBuilder, {"A": A, "B": B, "C": C});
// 2. Build the graph into an executable.
const graph = await graphBuilder.build({'C': C});
// 3. Bind inputs to the graph and execute for the result.
const bufferA = new Float32Array([42]);
const bufferC = new Float32Array(1);
const inputs = {'A': bufferA};
const outputs = {'C': bufferC};
await mlContext.compute(graph, inputs, outputs);
console.log(C);
console.log(C.shape()); // Private extension needed for debugging
Specification Update
The problematic statement in the spec is "If dimensions.length is 0, return false." in https://www.w3.org/TR/webnn/#api-mloperand-create. Delete that line, and the rest should work.
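A minimal sketch of the dimension validation with that step removed (the function name and the remaining check are illustrative paraphrases, not the spec's exact algorithm):

```python
def validate_dimensions(dimensions: list[int]) -> bool:
    # Sketch only. The removed spec step was:
    #   "If dimensions.length is 0, return false."
    # With it gone, an empty list (a 0D scalar) validates successfully,
    # while zero or negative dimensions are still rejected.
    for dimension in dimensions:
        if dimension <= 0:
            return False
    return True

print(validate_dimensions([]))       # True -- 0D scalar now allowed
print(validate_dimensions([1, 2]))   # True
print(validate_dimensions([0]))      # False
```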
The computation for the element count and byte length is in fact already correct (it returns 1 element for empty dimensions), https://www.w3.org/TR/webnn/#api-mloperanddescriptor:
The byte length of an MLOperandDescriptor desc is the value returned by the following steps:
Let elementLength be 1.
For each dimension of desc.dimensions:
Set elementLength to elementLength × dimension.
Let elementSize be the element size of one of the ArrayBufferView types that matches desc.type according to this table.
Return elementLength × elementSize.
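The steps above can be sketched as follows; the element-size table is a small assumed subset for illustration. Because the product over an empty dimension list is 1, a 0D shape [] naturally yields one element with no special-casing:

```python
# Element sizes per operand data type (assumed subset for illustration).
ELEMENT_SIZE = {"float32": 4, "float16": 2, "int32": 4, "uint8": 1}

def byte_length(data_type: str, dimensions: list[int]) -> int:
    # "Let elementLength be 1. For each dimension, multiply." -- the empty
    # product is 1, so empty dimensions mean one element (a scalar).
    element_length = 1
    for dimension in dimensions:
        element_length *= dimension
    return element_length * ELEMENT_SIZE[data_type]

print(byte_length("float32", [2, 3]))  # 24
print(byte_length("float32", []))      # 4 -- a scalar still occupies one element
```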
