Add a non-normative table of operators by category by anssiko · Pull Request #868 · webmachinelearning/webnn

anssiko · 2025-06-11T17:04:11Z

This non-normative table with functional grouping of operators complements the normative definition of operators that are currently in alphabetical order.

I tried to check that we're covering all the operators, but it is quite possible I missed some. ~~I proactively added roundEven #859.~~ Feedback is welcome on whether this makes sense and what would make for a good categorization.

The caveat is this table needs to be manually maintained alongside the normative definition, but that'd be just one extra line.


fdwr

Thanks Anssi. Yes, I think it's useful to have an upfront table that categorizes operators like this.

fdwr · 2025-06-11T18:49:13Z

index.bs

+
+*This section is non-normative.*
+
+The WebNN API defines a set of operators required by well-known CNN and RNN, transformer and generative models that address key [[#usecases-application]]. The details of each operator are defined in the normative sections of this specification, in alphabetical order by the operator name. These operators are grouped into categories based on their functionality in the following non-normative table to give a functional overview of the API surface.


The hardest challenge with putting them all into mutually distinct groups is that some operators belong to multiple sets. For example, if we were to add sinh & cosh to WebNN, then surely they would go along with all the other math operators, but sinh/cosh/tanh are a trio, and so it would be weird to have tanh in a completely different category. Similarly, clamp is definitely a math function along with its siblings min and max, but quite rarely it's also useable for activation. So is it more a math function or an activation function? Neither - it's both. I tried putting operators into distinct categories in my listing here and eventually realized it was futile 😉, instead giving them multiple tags.

I have to admit I struggled with this part too. Some of the groupings were inherited from another exercise with @huningxin where we had to simplify things for another audience.

For web specs, dynamic elements might not be appropriate for a11y, printing and other reasons. Otherwise I'd happily port over your dynamic table to the spec (I had forgotten about this fantastic resource!).

Would a reasonable tradeoff for a static representation be to add those operators that are clearly multipurpose to multiple categories? I experimented with this idea using clamp. For those operators that are primarily associated with a particular category e.g. for historical reasons, say math functions, but are also used for other purposes, say for masking, I'd still keep them in maths. Not perfect, but...

Yes, that's what I was thinking - just include tanh and clamp under math but also activation.

OK, let's go with this simple solution. Updated to include tanh and clamp under math too.
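The "dual citizenship" idea agreed on above can be sketched as a plain category-to-operators mapping in which an operator may appear under more than one category. This is only an illustration: the `CATEGORIES` contents are a small assumed subset pieced together from this thread (the "Activation" label in particular is a guess), not the merged table, and `categories_of` is a hypothetical helper.

```python
# Illustrative subset only; the category names and memberships are
# assumptions drawn from this discussion, not the table in index.bs.
CATEGORIES = {
    "Mathematics": ["clamp", "max", "min", "tanh"],
    "Activation": ["clamp", "softplus", "softsign", "tanh"],
    "Tensor manipulation": ["slice", "split", "transpose", "triangular"],
}


def categories_of(op):
    """Return every category that lists the given operator, sorted by name."""
    return sorted(name for name, ops in CATEGORIES.items() if op in ops)


clamp_categories = categories_of("clamp")            # listed twice on purpose
triangular_categories = categories_of("triangular")  # listed once
```

A static table built this way stays printable and accessible while still letting multipurpose operators show up wherever a reader might look for them.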

fdwr · 2025-06-11T20:25:37Z

index.bs

+      {{MLGraphBuilder/lesser()}},
+      {{MLGraphBuilder/lesserOrEqual()}},
+      {{MLGraphBuilder/logicalNot()}},
+      {{MLGraphBuilder/where()}},


🤔 I suppose the selection operator (aka TOSA select) could be thought of as a logical operator, given it takes boolean input (all the others produce boolean output), but I always thought of it as more of a data reorganization operator like gather, gathering data respectively from input a or b, depending on the boolean indices (e.g. pseudocode kinda like gather(concat([a, b], axis:rank), cast(indices, "uint32"), axis:rank)). Also, where isn't under the Element-wise logical operations list either.
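The gather analogy can be made concrete with a minimal pure-Python sketch over flat lists. It assumes no broadcasting, and it stacks the candidates as [false value, true value] so that a condition cast to 0/1 picks the matching element; that stacking order and both function names are assumptions made for this illustration, not WebNN API calls.

```python
def where(condition, true_value, false_value):
    """Element-wise select: true_value[i] if condition[i] else false_value[i]."""
    return [t if c else f for c, t, f in zip(condition, true_value, false_value)]


def where_via_gather(condition, true_value, false_value):
    """Same result, phrased as a gather along a new trailing axis."""
    # Stack the candidates so that stacked[i] == [false_value[i], true_value[i]].
    stacked = [[f, t] for t, f in zip(true_value, false_value)]
    # Cast the boolean condition to integer indices (False -> 0, True -> 1).
    indices = [int(bool(c)) for c in condition]
    # Gather exactly one element per position along the new axis.
    return [pair[i] for pair, i in zip(stacked, indices)]


condition = [True, False, False, True]
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
direct = where(condition, a, b)
gathered = where_via_gather(condition, a, b)
```

Both paths only move existing values around, which is the argument for treating the operator as data reorganization rather than logic or control flow.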

I parked where to "Tensor rearrangement" for now.

FYI, CoreML puts select into "control flow" category: https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#coremltools.converters.mil.mil.ops.defs.iOS15.control_flow.select

> FYI, CoreML puts select into "control flow" category:

I can see why someone might think that, given it feels like a C ternary operator (a ? b : c), but functionally it's more like a gatherElements with 0/1 indices, as it doesn't cause any control flow change in the graph execution (unlike say ONNX If/While/Scan or CoreML cond/while_loop), and WebNN doesn't support any true control flow operators currently, meaning it would be a weirdly isolated category.

> WebNN doesn't support any true control flow operators currently, meaning it would be a weirdly isolated category.

That's true.

TOSA places select into "Elementwise Ternary Operators" category: https://www.mlplatform.org/tosa/tosa_spec.html#_elementwise_ternary_operators

Should we use "element-wise binary", "element-wise logical" and "element-wise unary" categories? They are already used in the normative text.

This table currently proposes a higher-level functional grouping, and as such abstracts out "element-wise" from "logical" ops and groups both "element-wise binary" and "element-wise unary" ops under "Mathematics".

I'd propose we try to go with a higher-level grouping first, and let the normative text offer more detail where appropriate.

fdwr · 2025-06-11T20:37:53Z

index.bs

+      {{MLGraphBuilder/softplus()}},
+      {{MLGraphBuilder/softsign()}},
+      {{MLGraphBuilder/tanh()}},
+      {{MLGraphBuilder/triangular()}}


The triangular matrix masking function doesn't feel like activation, but I don't know what category it fits in either 🤔. Are there cases where trilu acts as activation that I'm unaware of?

Does it make sense to add a new "Tensor masking" / "Matrix masking" category for operators that are primarily(?) used for masking?

Experimented with this idea using triangular.

Also "tensor rearrangement" op? FYI, CoreML puts band_part into "tensor operation" category: https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#coremltools.converters.mil.mil.ops.defs.iOS15.tensor_operation.band_part

> Does it make sense to add a new "Tensor masking"
> Also "tensor rearrangement" op?

Content with either.

Moved triangular to "Tensor manipulation" (was "Tensor rearrangement") so we can drop the "Tensor masking" category, which I did.

anssiko · 2025-06-12T08:51:36Z

@fdwr, much thanks for your suggestions. I pushed an update with a new iteration to test some new ideas based on your feedback.

huningxin · 2025-06-12T14:37:03Z

index.bs

+      {{MLGraphBuilder/slice()}},
+      {{MLGraphBuilder/split()}},
+      {{MLGraphBuilder/transpose()}},
+      {{MLGraphBuilder/resample2d()}},


I feel resample2d is not a tensor rearrangement op, because it doesn't just rearrange the existing values, but generates new values. FYI, CoreML puts "resample" into "image resizing" category: https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#coremltools.converters.mil.mil.ops.defs.iOS16.image_resizing.resample

> because it doesn't just rearrange the existing values, but generates new values

That is true for linear interpolation.

Notice for nearest neighbor that resample is functionally identical to integral upsampling (or blockwise expansion, like that used in dequantizeLinear). One can even implement dequantizeLinear's blockwise expansion step via a backend's resampling operator + NN (and I was thinking to augment expand to accept block sizes for better functional decomposition of Q/DQ). So, there is clearly a familial relationship between spatial expansion and resampling, along with other spatial interpolation functions like ROI alignment and warped grid sampling.
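The equivalence for integral scale factors can be checked with a small 1D sketch in pure Python. The `floor(i / scale)` coordinate mapping is an assumption (backends differ in their rounding conventions), and both function names are hypothetical stand-ins rather than WebNN operators.

```python
def resample_nearest_1d(values, scale):
    """Nearest-neighbor upsampling by an integer scale.

    Assumes output index i reads from input index floor(i / scale).
    """
    out_len = len(values) * scale
    return [values[i // scale] for i in range(out_len)]


def expand_blockwise_1d(values, block_size):
    """Repeat each element block_size times, as a blockwise expansion would."""
    return [v for v in values for _ in range(block_size)]


# Hypothetical per-block scales, expanded to cover two elements each.
scales = [0.5, 2.0, 4.0]
resampled = resample_nearest_1d(scales, 2)
expanded = expand_blockwise_1d(scales, 2)
```

Under that assumption the two functions produce identical output, and neither creates values that were not already in the input, unlike linear interpolation.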

> into "image resizing" category:

It's true that these spatial interpolation functions are often used with imagery, but 1D resampling is useful for audio rescaling too (so I wouldn't limit it to just "image resizing"). Maybe there's a category name that better encompasses all these spatial remappings/reprojections? Anssi originally had the category name of "Tensor manipulation", which he changed per my comment because I found it a little too generic, but would you find that amenable? "Spatial modification"? "Spatial manipulation"? "Tensor reprojection"?

> but 1D resampling is useful for audio rescaling too (so I wouldn't limit it to just "image resizing")

Agreed. For 2D resample/resize, TOSA also places it under "image operators" category: https://www.mlplatform.org/tosa/tosa_spec.html#_image_operators

Switched back to the more generic "Tensor manipulation" name for now so that resample feels welcome.

fdwr

Thanks for the updates AK. Looks good to me after tanh/clamp dual citizenship and deciding on a name for the spatial remapping category.


huningxin · 2025-06-13T00:47:43Z

index.bs

+      {{MLGraphBuilder/gatherElements()}},
+      {{MLGraphBuilder/scatterElements()}},
+      {{MLGraphBuilder/gatherND()}},
+      {{MLGraphBuilder/scatterND()}},


Both CoreML and TOSA have a dedicated "gather/scatter" category:
https://apple.github.io/coremltools/source/coremltools.converters.mil.mil.ops.defs.html#module-coremltools.converters.mil.mil.ops.defs.iOS15.scatter_gather
https://www.mlplatform.org/tosa/tosa_spec.html#_scattergather_operators

huningxin · 2025-06-13T00:51:01Z

index.bs

+      </td>
+    </tr>
+    <tr>
+      <td>Tensor casting</td>


FYI, TOSA groups cast and scale (equivalent to quantizeLinear/dequantizeLinear) to "Type Conversion" category: https://www.mlplatform.org/tosa/tosa_spec.html#_type_conversion

I'm happy either way on this: having Q/DQ in their own category, or putting them as part of data type conversion.

anssiko · 2025-06-13T10:53:12Z

@fdwr @huningxin thanks for your suggestions. I pushed another update. PTAL if I missed any feedback: fed3ae8

(This has been a fun exercise and a reminder that naming in this field is hard and does not always agree with other fields: statistics, physics, information theory... While "Web API for chains of differentiable, parameterized geometric functions" would be a more accurate name, we call this spec the "Web Neural Network API" ;-))

fdwr

👍

anssiko · 2025-06-17T09:53:11Z

I will merge this first stab to shorten the open PR queue. Thank you for your insights @fdwr and @huningxin!

Please feel free to continue adjusting this table as you see fit. A natural checkpoint for adjustment is when new ops are being added or existing ones removed, i.e. the moment when this table needs to be updated manually. I hope that won't become a chore but is actually a useful exercise to reason about our op coverage.
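Since the table is maintained by hand, that checkpoint could be backed by a simple consistency check. The following is a minimal sketch, assuming the normative operator names and the table contents are already available as plain Python data; `check_coverage` and the sample inputs are hypothetical and not part of the spec tooling.

```python
def check_coverage(normative_ops, table):
    """Compare normative operator names against a category table.

    Returns (missing, stale): operators defined normatively but absent
    from the table, and operators listed in the table but not defined.
    """
    listed = {op for ops in table.values() for op in ops}
    defined = set(normative_ops)
    missing = sorted(defined - listed)
    stale = sorted(listed - defined)
    return missing, stale


# Made-up sample data for illustration only.
normative_ops = ["clamp", "gather", "tanh", "where"]
table = {
    "Mathematics": ["clamp", "tanh"],
    "Activation": ["clamp", "tanh"],
    "Tensor manipulation": ["where", "roundEven"],
}
missing, stale = check_coverage(normative_ops, table)
```

An operator listed in several categories is counted once, so the multi-category entries do not trigger false reports.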

Add a non-normative table of operators by category

5732866

anssiko requested review from fdwr and huningxin June 11, 2025 17:04

Remove not-yet landed roundEven() from the non-normative table

05d6e73

And dedupe ids.

fdwr reviewed Jun 11, 2025

Update the non-normative table of operators by category

cd90b08

Incorporate review feedback and suggestions.

huningxin reviewed Jun 12, 2025

fdwr approved these changes Jun 12, 2025

huningxin reviewed Jun 13, 2025

Update the non-normative table of operators by category

fed3ae8

Incorporate more review feedback and suggestions.

fdwr approved these changes Jun 13, 2025

anssiko merged commit f2f5f93 into main Jun 17, 2025
2 checks passed

anssiko deleted the ops-by-category branch June 17, 2025 09:54

github-actions bot added a commit that referenced this pull request Jun 17, 2025

Add a non-normative table of operators by category (#868)

0433e32

SHA: f2f5f93 Reason: push, by anssiko Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

zolkis pushed a commit to zolkis/webnn that referenced this pull request Jun 17, 2025

Add a non-normative table of operators by category (webmachinelearnin…

5a41461

…g#868) With contributions from Ningxin and Dwayne.

dontcallmedom mentioned this pull request Jan 15, 2026

CR Snapshot Update Request for Web Neural Network API - webnn w3c/transitions#769

Closed
