MT5 onnx conversion for beam search #11958
Merged
Conversation
This pull request introduces 1 alert when merging fe6605d into e24349b (view new alerts on LGTM.com).
wangyems
reviewed
Jun 23, 2022
@@ -217,6 +228,7 @@ def export_onnx_models(
def main():
Contributor
maybe comment/assert somewhere that onnx>=1.12 is needed for subgraph proto > 2G case?
Contributor
Author
Sure. Let me add a check of the ONNX version in the next PR.
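The version check discussed above could look something like the following. This is a hedged sketch, not code from the PR: the helper names (`version_tuple`, `check_onnx_version`) are illustrative, and the version string is parsed with the standard library only, since `packaging` may not be installed.

```python
def version_tuple(version):
    """Parse a dotted version string like "1.12.0" into a comparable tuple,
    ignoring any non-numeric suffix (e.g. "1.12.0rc1" -> (1, 12, 0))."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def check_onnx_version(min_version="1.12.0"):
    """Raise if the installed onnx package is older than min_version.
    onnx >= 1.12 is needed to save models whose subgraph protos exceed 2GB."""
    import onnx  # imported lazily so the helper itself has no hard dependency

    if version_tuple(onnx.__version__) < version_tuple(min_version):
        raise RuntimeError(
            f"onnx>={min_version} is required for subgraphs larger than 2GB; "
            f"found onnx {onnx.__version__}"
        )
```

Calling `check_onnx_version()` near the top of `main()` would fail fast with a clear message instead of the protobuf size error surfacing mid-export.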
Description:
Support conversion of mT5 models to ONNX for beam search.
The output will have two files (e.g. mt5-large-beamsearch.onnx and mt5-large-beamsearch.onnx.data) when the external data format (-e) is used.
Note: please install the ONNX 1.12 package for this. Otherwise, you might encounter an error like 'Message onnx.ModelProto exceeds maximum protobuf size of 2GB:' when saving the output model.
Some intermediate encoder and decoder ONNX models can be found in ./google; those files are not needed (they are preserved for debugging purposes).
Right now the model can run, but the max diff can be around 8e-3 for the encoder or decoder, which could cause beam search results to differ between PyTorch and ORT. That is a separate issue for the PyTorch exporter.
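The "max diff" above refers to the largest elementwise absolute difference between PyTorch and ORT outputs. A minimal, stdlib-only sketch of such a parity check follows; the function names and the 1e-2 tolerance are illustrative assumptions, not taken from the PR's verification code.

```python
def max_abs_diff(a, b):
    """Recursively walk two (possibly nested) lists of numbers and
    return the largest elementwise absolute difference."""
    if isinstance(a, (int, float)):
        return abs(a - b)
    return max(max_abs_diff(x, y) for x, y in zip(a, b))


def outputs_close(torch_out, ort_out, tol=1e-2):
    """True if the two outputs agree within tol. Note that a max diff
    around 8e-3 passes a 1e-2 tolerance yet can still flip beam search
    results, since beam scoring is sensitive to small logit changes."""
    return max_abs_diff(torch_out, ort_out) <= tol
```

This illustrates why a passing numeric parity check does not guarantee identical beam search output between the two runtimes.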
Motivation and Context
Related issues: #11813, #11848