Add fusion patterns for conformer-transducer model #18461
Merged
apsonawane merged 6 commits into main on Nov 19, 2023
Conversation
tianleiwu reviewed Nov 16, 2023
Contributor
Please add a test case for the attention fusion. Otherwise, it will not be possible to prevent regressions in the future.
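The requested regression test would typically run the optimizer on a small model and assert that the fused op appears. The sketch below mimics that check on a toy list of op types; the `fuse_attention` helper and the op names are illustrative assumptions, not the PR's actual test code:

```python
# Toy stand-in for an attention-fusion regression test: after "fusion",
# the MatMul/Softmax/MatMul subgraph should be replaced by a single
# MultiHeadAttention node, and the consumed ops should be gone.
def fuse_attention(ops):
    """Illustrative fusion: collapse the unfused attention pattern."""
    pattern = ["MatMul", "Softmax", "MatMul"]
    for i in range(len(ops) - len(pattern) + 1):
        if ops[i:i + len(pattern)] == pattern:
            return ops[:i] + ["MultiHeadAttention"] + ops[i + len(pattern):]
    return ops  # pattern not found: graph unchanged

unfused = ["LayerNormalization", "MatMul", "Softmax", "MatMul", "Add"]
fused = fuse_attention(unfused)
assert "MultiHeadAttention" in fused   # fusion happened
assert "Softmax" not in fused          # pattern was consumed
print(fused)  # → ['LayerNormalization', 'MultiHeadAttention', 'Add']
```

A real test would instead load a checked-in ONNX model, call the optimizer with the conformer model type, and count nodes of the fused op type in the resulting graph.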
tianleiwu reviewed Nov 17, 2023
onnxruntime/python/tools/transformers/fusion_conformer_attention.py
from typing import List

import numpy as np
import onnx
Check notice (Code scanning / CodeQL): Module is imported with 'import' and 'import from'
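The CodeQL notice flags mixing whole-module imports with from-imports of the same module (for example `import onnx` alongside `from onnx import helper`), which makes it ambiguous which binding a name resolves to. A minimal sketch of the consistent style, using a stdlib module as a stand-in:

```python
# CodeQL flags a module that is imported both ways:
#   import onnx
#   from onnx import helper
# Consistent alternative: import the module once and qualify members
# through that single name. `json` is a stdlib stand-in for the pattern.
import json

data = json.loads('{"num_heads": 8}')  # member access via the module name
print(data["num_heads"])  # → 8
```

Either style alone is fine; the warning is only about using both for the same module in one file.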
class ConformerOnnxModel(BertOnnxModel):
    def __init__(self, model, num_heads, hidden_size):
        super().__init__(model, num_heads, hidden_size)
        self.attention_mask = AttentionMask(self)
Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
    def __init__(self, model, num_heads, hidden_size):
        super().__init__(model, num_heads, hidden_size)
        self.attention_mask = AttentionMask(self)
        self.attention_fusion = FusionConformerAttention(self, self.hidden_size, self.num_heads, self.attention_mask)
Check warning (Code scanning / CodeQL): Overwriting attribute in super-class or sub-class
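The two CodeQL warnings refer to the same pattern: the subclass `__init__` reassigns an attribute that the base-class `__init__` already set, so the base-class assignment becomes dead work. A minimal reproduction (class names mirror the PR; the base-class body here is a simplified assumption, not the real `BertOnnxModel`):

```python
# Sketch of the pattern CodeQL warns about: a subclass overwriting an
# attribute that the base class's __init__ already assigned.

class BertOnnxModel:
    def __init__(self, model, num_heads, hidden_size):
        self.model = model
        self.num_heads = num_heads
        self.hidden_size = hidden_size
        self.attention_mask = "base-mask"   # set once in the base class

class ConformerOnnxModel(BertOnnxModel):
    def __init__(self, model, num_heads, hidden_size):
        super().__init__(model, num_heads, hidden_size)
        # Reassigning here triggers "Overwriting attribute in super-class
        # or sub-class": the base-class value is created and discarded.
        self.attention_mask = "conformer-mask"

m = ConformerOnnxModel(None, 8, 512)
print(m.attention_mask)  # → conformer-mask
```

Typical remedies are to let the base class own the attribute (perhaps parameterized via an argument) or to use a differently named attribute in the subclass; the warning is about wasted work and ambiguity, not incorrect behavior.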
kunal-vaishnavi approved these changes on Nov 19, 2023
kleiti pushed a commit to kleiti/onnxruntime that referenced this pull request on Mar 22, 2024
Description

Add conformer-transducer model type to the optimizer. This PR adds pattern matches for the attention subgraphs shown below.

Unfused attention:
![unfused attention](https://camo.githubusercontent.com/3a4ae9bbdeb9b819e23c42eee5cd6e6288df1cfc4061de0e67f3a6d96e4a8bfc/68747470733a2f2f696d6775722e636f6d2f47723955554c792e706e67)

Fused attention:
![fused attention](https://camo.githubusercontent.com/f32db91ae2ca427cf4e4c6a7fb29850a1e4a45a1c50f5f9b13c11b09eb2e90ea/68747470733a2f2f696d6775722e636f6d2f506c69384d6b592e706e67)