Conversation
huningxin
left a comment
@wchao1115, thanks for your great work. LSTM is an important RNN network; WebNN should support it.
Sorry for the late reply, I was out sick for a few days.
index.bs
- *weight*: an {{MLOperand}}. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
Should hiddenSize be an unsigned long scalar?
You may want to change hiddenSize of gruCell to unsigned long as well.
New commit should address all the feedback here. Please take a look.
huningxin
left a comment
Thanks for the update. You may have missed changing two hiddenSize instances to unsigned long scalar. Others look good to me.
index.bs
- *weight*: an {{MLOperand}}. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
You may want to change hiddenSize of gruCell to unsigned long as well.
index.bs
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [4 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *cellState*: an {{MLOperand}}. The 2-D input cell state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
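To illustrate the per-unit math behind the 2-D shapes quoted above, here is a minimal plain-JavaScript sketch of an lstmCell-style state update for a single hidden unit. The function name and the assumption that the gate pre-activations are already summed (weight * input + recurrentWeight * hiddenState + biases) are illustrative only, not spec text:

```javascript
// Scalar LSTM cell step for one hidden unit (hiddenSize = 1).
// Inputs iIn/oIn/fIn/gIn are the pre-activation values of the
// input (i), output (o), forget (f), and cell (g) gates, following
// the default "iofg" ordering described above. Illustrative only.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function lstmCellUnit(iIn, oIn, fIn, gIn, cellPrev) {
  const i = sigmoid(iIn);             // input gate
  const o = sigmoid(oIn);             // output gate
  const f = sigmoid(fIn);             // forget gate
  const g = Math.tanh(gIn);           // cell gate
  const cell = f * cellPrev + i * g;  // new cell state
  const hidden = o * Math.tanh(cell); // new hidden state
  return { cell, hidden };
}
```

With all pre-activations at zero the gates sit at 0.5, so the previous cell state is simply halved, which makes the blend of forget and input paths easy to eyeball.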
huningxin
left a comment
Looks good with a few nits that I missed in the last review. Sorry and please take a look.
index.bs
### The gru() method ### {#api-mlgraphbuilder-gru}
Gated Recurrent Unit [[GRU]] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of the Network
Gated Recurrent Unit [[GRU]] recurrent network uses an update, reset, and new gate to compute the output state that rolls into the output across the temporal sequence of the network
@@ -1411,33 +1411,34 @@ dictionary MLGruOptions {
    boolean resetAfter = true;
    boolean returnSequence = false;
    MLRecurrentNetworkDirection direction = "forward";
Since MLRecurrentNetworkDirection is also used by MLLstmOptions, should this shared definition be outside the gru-specific section?
We don't dedicate a section for an enum. Since gru comes first, I just parked this enum there.
index.bs
<div class="note">
The behavior of this operation when the activations of the update/reset gate and new gate are of the operator types *sigmoid* and *tanh* respectively can be generically emulated from the usage of other operations as follow. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
The behavior of this operation when the weight layout is the default *"zrn"* layout, and the activation functions of the update/reset gate and new gate are of the operator types *sigmoid* and *tanh* respectively can be generically emulated from the usage of other operations as follow. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
There was a problem hiding this comment.
"as follow" -> "as follows"
Wording is awkward because it piles on many intermediate phrases before the final verb clause. How about:
"The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "zrn" layout, and the activation functions of the update/reset gate and new gate are of the operator types sigmoid and tanh respectively."
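To make the emulation note concrete, the sketch below runs one GRU cell step for a single batch element in plain JavaScript, assuming the default *"zrn"* layout, the *resetAfter* formulation, and sigmoid/tanh activations. The helper names are invented for illustration and are not part of the spec:

```javascript
// One GRU cell step for a single batch element (plain arrays, no WebNN).
// "zrn" layout: rows 0..h-1 = update (z), h..2h-1 = reset (r),
// 2h..3h-1 = new (n) gate blocks of W, R, b, and rb.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function gruCell(x, hPrev, W, R, b, rb, hiddenSize) {
  // W: [3*hiddenSize][inputSize], R: [3*hiddenSize][hiddenSize],
  // b and rb: [3*hiddenSize] input and recurrent bias vectors.
  const dot = (row, v) => row.reduce((s, w, i) => s + w * v[i], 0);
  // k selects the gate block in "zrn" order: 0 = update, 1 = reset, 2 = new.
  const ix = (k, i) => dot(W[k * hiddenSize + i], x) + b[k * hiddenSize + i];
  const hx = (k, i) => dot(R[k * hiddenSize + i], hPrev) + rb[k * hiddenSize + i];

  const hNext = [];
  for (let i = 0; i < hiddenSize; ++i) {
    const z = sigmoid(ix(0, i) + hx(0, i));       // update gate
    const r = sigmoid(ix(1, i) + hx(1, i));       // reset gate
    const n = Math.tanh(ix(2, i) + r * hx(2, i)); // new gate, resetAfter form
    hNext.push((1 - z) * n + z * hPrev[i]);       // blend new and previous state
  }
  return hNext;
}
```

With all weights and biases at zero, z and r sit at 0.5 and n at 0, so the next hidden state is half of the previous one, which is a quick sanity check on the gate wiring.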
<div algorithm=lstm>
**Arguments:**
- *input*: an {{MLOperand}}. The input 3-D tensor of shape [steps, batch_size, input_size].
- *weight*: an {{MLOperand}}. The 3-D input weight tensor of shape [num_directions, 4 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the *options.layout* argument.
Naming inconsistencies:
hidden_size (line 1873) vs hiddenSize (line 1866)
num_directions vs numDirections (line 1893)
Thanks for pointing this out. But I'm not going to address this issue in this PR. It probably needs a round of cleanup.
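As a side note on the shapes under discussion, this small hypothetical helper (names assumed, not from the spec) spells out the lstm() tensor shapes described in the quoted arguments above:

```javascript
// Hypothetical helper: derives the lstm() tensor shapes from the
// parameters named in the spec text. Names are illustrative only.
function lstmShapes({ steps, batchSize, inputSize, hiddenSize, numDirections }) {
  return {
    input: [steps, batchSize, inputSize],
    weight: [numDirections, 4 * hiddenSize, inputSize],
    recurrentWeight: [numDirections, 4 * hiddenSize, hiddenSize],
    // Hidden and cell state carried per direction:
    outputHiddenState: [numDirections, batchSize, hiddenSize],
    outputCellState: [numDirections, batchSize, hiddenSize],
  };
}
```

The factor 4 comes from packing the input, output, forget, and cell gate weight vectors along one dimension, which is what the layout option reorders.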
huningxin
left a comment
LGTM, thanks for the great work!
anssiko
left a comment
@wchao1115 thanks for adding these important ops. I'm clearing my review with a few grammar nits only, since this PR has already been reviewed quite extensively.
I'll put this PR on our next call agenda and expect we can land this soon for timely inclusion into the CR scope. We explicitly mention LSTM alongside RNN in our current charter, so good to get this in from that point of view too.
index.bs
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- *layout*: a {{MLRecurrentNetworkWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is *"zrn"*.
- *activations*: a sequence of {{MLOperator}}. A pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, it's assumed to be the sigmoid (*"sigmoid"*) and the hyperbolic tangent (*"tanh"*) function respectively.
- *layout*: a {{MLGruWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is *"zrn"*.
index.bs
- *initialHiddenState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *initialCellState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *returnSequence*: a {{boolean}} indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. Default to false.
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
index.bs
- *initialCellState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *returnSequence*: a {{boolean}} indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. Default to false.
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- *layout*: a {{MLLstmWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the *input (i)*, *output (o)*, *forget (f)*, and *cell (g)* gate, as indicated in the second dimension of the weight and bias tensor shapes. When not specified, the default layout is *"iofg"*.
index.bs
- *bias*: an {{MLOperand}}. The 1-D input bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentBias*: an {{MLOperand}}. The 1-D recurrent bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *peepholeWeight*: an {{MLOperand}}. The 1-D weight tensor for peepholes of shape [3 * hidden_size]. The pack ordering of the weight vectors is for the *input (i)*, *output (o)*, and *forget (f)* gate respectively.
- *layout*: a {{MLLstmWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the *input (i)*, *output (o)*, *forget (f)*, and *cell (g)* gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is *"iofg"*.
This CL implements a WebNN spec change [1] that, instead of exposing the generic MLOperator interface, exposes the MLActivation interface as a more specific definition of an activation function. Because MLOperator is not exposed to JavaScript anymore, instead of inheriting from ScriptWrappable, it inherits from GarbageCollected. MLOperator is still used internally by MLGraph to represent the graph node that connects with MLOperands. The standalone MLOperator is used by MLActivation to represent the activation function type and keep the additional attributes. The unit tests and WPT baseline are updated according to this interface change.

[1]: webmachinelearning/webnn#321

Bug: 1273291
Change-Id: Ia5084a69b6756e54d1c5c05140df4f7844c88bd2
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4298931
Reviewed-by: Jiewei Qian <[email protected]>
Commit-Queue: ningxin hu <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1114320}
Finally had a chance to do this. Adding `lstm` and `lstmCell` as additional operations in the spec. Since the addition of `gru` and `gruCell` there have been numerous questions about LSTM. As a widely popular recurrent network, supporting LSTM along with GRU is generally seen as important for API completeness.

This change also made some editorial cleanups of the existing wording for GRU, and proposes a name change of the existing `MLOperator` to `MLActivation` to reflect its true purpose in the spec as a generic definition of an activation function, i.e. not every operation can be an activation function.