Conversation
huningxin
left a comment
@wchao1115, thanks for your great work. LSTM is an important RNN network; WebNN should support it.
Sorry for the late reply, I was out sick for a few days.
index.bs
- *weight*: an {{MLOperand}}. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
Should hiddenSize be an unsigned long scalar?
You may want to change hiddenSize of gruCell to unsigned long as well.
New commit should address all the feedback here. Please take a look.
huningxin
left a comment
Thanks for the update. You may have missed changing two hiddenSize instances to unsigned long scalar. Others look good to me.
index.bs
- *weight*: an {{MLOperand}}. The 2-D input weight tensor of shape [3 * hidden_size, input_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [3 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
You may want to change hiddenSize of gruCell to unsigned long as well.
index.bs
- *recurrentWeight*: an {{MLOperand}}. The 2-D recurrent weight tensor of shape [4 * hidden_size, hidden_size]. The ordering of the weight vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *hiddenState*: an {{MLOperand}}. The 2-D input hidden state tensor of shape [batch_size, hidden_size].
- *cellState*: an {{MLOperand}}. The 2-D input cell state tensor of shape [batch_size, hidden_size].
- *hiddenSize*: a {{long}} scalar. The value of the second dimension of the output tensor shape. It indicates the number of features in the hidden state.
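To illustrate the per-unit math behind the 2-D shapes quoted above, here is a minimal plain-JavaScript sketch of an lstmCell-style state update for a single hidden unit. The function name and the assumption that the gate pre-activations are already summed (weight * input + recurrentWeight * hiddenState + biases) are illustrative only, not spec text:

```javascript
// Scalar LSTM cell step for one hidden unit (hiddenSize = 1).
// Inputs iIn/oIn/fIn/gIn are the pre-activation values of the
// input (i), output (o), forget (f), and cell (g) gates, following
// the default "iofg" ordering described above. Illustrative only.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function lstmCellUnit(iIn, oIn, fIn, gIn, cellPrev) {
  const i = sigmoid(iIn);             // input gate
  const o = sigmoid(oIn);             // output gate
  const f = sigmoid(fIn);             // forget gate
  const g = Math.tanh(gIn);           // cell gate
  const cell = f * cellPrev + i * g;  // new cell state
  const hidden = o * Math.tanh(cell); // new hidden state
  return { cell, hidden };
}
```

With all pre-activations at zero the gates sit at 0.5, so the previous cell state is simply halved, which makes the blend of forget and input paths easy to eyeball.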
huningxin
left a comment
Looks good with a few nits that I missed in the last review. Sorry and please take a look.
index.bs
### The gru() method ### {#api-mlgraphbuilder-gru}
Gated Recurrent Unit [[GRU]] recurrent network using an update gate and a reset gate to compute the hidden state that rolls into the output across the temporal sequence of the Network
Gated Recurrent Unit [[GRU]] recurrent network uses an update, reset, and new gate to compute the output state that rolls into the output across the temporal sequence of the network
@@ -1411,33 +1411,34 @@ dictionary MLGruOptions {
    boolean resetAfter = true;
    boolean returnSequence = false;
    MLRecurrentNetworkDirection direction = "forward";
Since MLRecurrentNetworkDirection is also used by MLLstmOptions, should this shared definition be outside the gru-specific section?
We don't dedicate a section for an enum. Since gru comes first, I just parked this enum there.
index.bs
<div class="note">
The behavior of this operation when the activations of the update/reset gate and new gate are of the operator types *sigmoid* and *tanh* respectively can be generically emulated from the usage of other operations as follow. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
The behavior of this operation when the weight layout is the default *"zrn"* layout, and the activation functions of the update/reset gate and new gate are of the operator types *sigmoid* and *tanh* respectively can be generically emulated from the usage of other operations as follow. However, user agents typically have a more efficient implementation for it, therefore its usage is encouraged from the performance standpoint.
There was a problem hiding this comment.
"as follow" -> "as follows"
Wording is awkward because it piles on many intermediate phrases before the final verb clause. How about:
"The behavior of this operation can be generically emulated via other operations as shown below, when the weight layout is the default "zrn" layout, and the activation functions of the update/reset gate and new gate are of the operator types sigmoid and tanh respectively."
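To make the emulation note concrete, the sketch below runs one GRU cell step for a single batch element in plain JavaScript, assuming the default *"zrn"* layout, the *resetAfter* formulation, and sigmoid/tanh activations. The helper names are invented for illustration and are not part of the spec:

```javascript
// One GRU cell step for a single batch element (plain arrays, no WebNN).
// "zrn" layout: rows 0..h-1 = update (z), h..2h-1 = reset (r),
// 2h..3h-1 = new (n) gate blocks of W, R, b, and rb.
const sigmoid = (x) => 1 / (1 + Math.exp(-x));

function gruCell(x, hPrev, W, R, b, rb, hiddenSize) {
  // W: [3*hiddenSize][inputSize], R: [3*hiddenSize][hiddenSize],
  // b and rb: [3*hiddenSize] input and recurrent bias vectors.
  const dot = (row, v) => row.reduce((s, w, i) => s + w * v[i], 0);
  // k selects the gate block in "zrn" order: 0 = update, 1 = reset, 2 = new.
  const ix = (k, i) => dot(W[k * hiddenSize + i], x) + b[k * hiddenSize + i];
  const hx = (k, i) => dot(R[k * hiddenSize + i], hPrev) + rb[k * hiddenSize + i];

  const hNext = [];
  for (let i = 0; i < hiddenSize; ++i) {
    const z = sigmoid(ix(0, i) + hx(0, i));       // update gate
    const r = sigmoid(ix(1, i) + hx(1, i));       // reset gate
    const n = Math.tanh(ix(2, i) + r * hx(2, i)); // new gate, resetAfter form
    hNext.push((1 - z) * n + z * hPrev[i]);       // blend new and previous state
  }
  return hNext;
}
```

With all weights and biases at zero, z and r sit at 0.5 and n at 0, so the next hidden state is half of the previous one, which is a quick sanity check on the gate wiring.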
<div algorithm=lstm>
**Arguments:**
- *input*: an {{MLOperand}}. The input 3-D tensor of shape [steps, batch_size, input_size].
- *weight*: an {{MLOperand}}. The 3-D input weight tensor of shape [num_directions, 4 * hidden_size, input_size]. The ordering of the weight vectors in the second dimension of the tensor shape is specified according to the *options.layout* argument.
Naming inconsistencies:
hidden_size (line 1873) vs hiddenSize (line 1866)
num_directions vs numDirections (line 1893)
Thanks for pointing this out. But I'm not going to address this issue in this PR. It probably needs a round of cleanup.
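As a side note on the shapes under discussion, this small hypothetical helper (names assumed, not from the spec) spells out the lstm() tensor shapes described in the quoted arguments above:

```javascript
// Hypothetical helper: derives the lstm() tensor shapes from the
// parameters named in the spec text. Names are illustrative only.
function lstmShapes({ steps, batchSize, inputSize, hiddenSize, numDirections }) {
  return {
    input: [steps, batchSize, inputSize],
    weight: [numDirections, 4 * hiddenSize, inputSize],
    recurrentWeight: [numDirections, 4 * hiddenSize, hiddenSize],
    // Hidden and cell state carried per direction:
    outputHiddenState: [numDirections, batchSize, hiddenSize],
    outputCellState: [numDirections, batchSize, hiddenSize],
  };
}
```

The factor 4 comes from packing the input, output, forget, and cell gate weight vectors along one dimension, which is what the layout option reorders.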
huningxin
left a comment
LGTM, thanks for the great work!
anssiko
left a comment
@wchao1115 thanks for adding these important ops. I'm clearing my review with a few grammar nits only, since this PR has already been reviewed quite extensively.
I'll put this PR on our next call agenda and expect we can land this soon for timely inclusion into the CR scope. We explicitly mention LSTM alongside RNN in our current charter, so good to get this in from that point of view too.
index.bs
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- *layout*: a {{MLRecurrentNetworkWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is *"zrn"*.
- *activations*: a sequence of {{MLOperator}}. A pair of activation functions with the first function used for the update and reset gate, and the second used for the new gate. When not specified, it's assumed to be the sigmoid (*"sigmoid"*) and the hyperbolic tangent (*"tanh"*) function respectively.
- *layout*: a {{MLGruWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of GRU, specifically the *update (z)*, *reset (r)*, and *new (n)* gate, as indicated in the second dimension of the weight and bias tensor shape. When not specified, the default layout is *"zrn"*.
index.bs
- *initialHiddenState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *initialCellState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *returnSequence*: a {{boolean}} indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. Default to false.
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
index.bs
- *initialCellState*: an {{MLOperand}}. The 3-D initial hidden state tensor of shape [num_directions, batch_size, hidden_size]. When not specified, it's assumed to be a tensor filled with zero.
- *returnSequence*: a {{boolean}} indicating whether to also return the entire sequence with every output from each time step in it in addition to the output of the last time step. Default to false.
- *direction*: a {{MLRecurrentNetworkDirection}}. The processing direction of the input sequence. When set to *"both"*, the size of the first dimension of the weight and the bias tensor shapes must be 2, and the input is processed in both directions.
- *layout*: a {{MLLstmWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the *input (i)*, *output (o)*, *forget (f)*, and *cell (g)* gate, as indicated in the second dimension of the weight and bias tensor shapes. When not specified, the default layout is *"iofg"*.
index.bs
- *bias*: an {{MLOperand}}. The 1-D input bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *recurrentBias*: an {{MLOperand}}. The 1-D recurrent bias tensor of shape [4 * hidden_size]. The ordering of the bias vectors in the first dimension of the tensor shape is specified according to the *options.layout* argument.
- *peepholeWeight*: an {{MLOperand}}. The 1-D weight tensor for peepholes of shape [3 * hidden_size]. The pack ordering of the weight vectors is for the *input (i)*, *output (o)*, and *forget (f)* gate respectively.
- *layout*: a {{MLLstmWeightLayout}}. The ordering of the weight and bias vectors for the internal gates of LSTM, specifically the *input (i)*, *output (o)*, *forget (f)*, and *cell (g)* gate, as indicated in the first dimension of the weight and bias tensor shapes. When not specified, the default layout is *"iofg"*.
This CL implements a WebNN spec change [1] that, instead of exposing the generic MLOperator interface, exposes the MLActivation interface as a more specific definition of an activation function. Because MLOperator is not exposed to JavaScript anymore, instead of inheriting from ScriptWrappable, it inherits from GarbageCollected. MLOperator is still used internally by MLGraph to represent the graph node that connects with MLOperands. The standalone MLOperator is used by MLActivation to represent the activation function type and keep the additional attributes. The unit tests and WPT baseline are updated according to this interface change.

[1]: webmachinelearning/webnn#321

Bug: 1273291
Change-Id: Ia5084a69b6756e54d1c5c05140df4f7844c88bd2
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4298931
Reviewed-by: Jiewei Qian <[email protected]>
Commit-Queue: ningxin hu <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1114320}
Finally had a chance to do this. Adding `lstm` and `lstmCell` as additional operations in the spec. Since the addition of `gru` and `gruCell` there have been numerous questions about LSTM. As a widely popular recurrent network, supporting LSTM along with GRU is generally seen as important for API completeness.

This change also made some editorial cleanups of the existing wording for GRU, and proposes a name change of the existing `MLOperator` to `MLActivation` to reflect its true purpose in the spec as a generic definition of an activation function, i.e. not every operation can be an activation function.