Skip to content

Comments

Validate the hidden size of GRU and LSTM operators#644

Merged
fdwr merged 10 commits intowebmachinelearning:mainfrom
inexorabletash:validate-hidden-size-multiples
May 1, 2024
Merged

Validate the hidden size of GRU and LSTM operators#644
fdwr merged 10 commits intowebmachinelearning:mainfrom
inexorabletash:validate-hidden-size-multiples

Conversation

@inexorabletash
Copy link
Contributor

@inexorabletash inexorabletash commented Apr 17, 2024

For gru()/gruCell() and lstm()/lstmCell(), a hiddenSize parameter is passed, a multiple of which defines a dimension of the output.

  • This is sometimes implicitly validated by the presence of an option with the same dimension - but not always.

  • Some underlying platforms operate on a single bias tensor, rather than the two bias/recurrentBias options present in the WebNN API. So the combined size needs to also be a valid dimension.

Introduce validation for all cases, validate the combined size, and add an explanation inline since this is subtle.

Fixes #625


Preview | Diff

For gru()/gruCell() and lstm()/lstmCell(), a hiddenSize parameter is
passed, a multiple of which defines a dimension of the output.

- This is sometimes implicitly validated by the presence of an option
  with the same dimension - but not always.

- Some underlying platforms operate on a single bias tensor, rather
  than the two bias/recurrentBias options present in the WebNN API. So
  the combined size needs to also be a valid dimension.

Introduce validation for all cases, validate the combined size, and
add an explanation inline since this is subtle.

Fixes #625
@inexorabletash inexorabletash changed the title Validate the maximum hidden size of GRU and LSTM operators Validate the hidden size of GRU and LSTM operators Apr 17, 2024
Copy link
Contributor

@huningxin huningxin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@huningxin huningxin requested a review from fdwr April 19, 2024 07:22
@huningxin
Copy link
Contributor

/cc @shiyi9801 @miaobin

@zolkis
Copy link
Collaborator

zolkis commented Apr 24, 2024

Special thanks for the clarifying notes.

Copy link
Contributor

@huningxin huningxin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@fdwr fdwr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@fdwr fdwr merged commit ecc55a3 into webmachinelearning:main May 1, 2024
github-actions bot added a commit that referenced this pull request May 1, 2024
SHA: ecc55a3
Reason: push, by fdwr

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@inexorabletash inexorabletash deleted the validate-hidden-size-multiples branch May 1, 2024 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Need clarify the maximum limit of the hidden size of the Gru operator

4 participants