yf711 (Contributor) commented Nov 14, 2024

…n TRT (#22681)

Add a new provider option, `trt_op_types_to_exclude`:
- Users can provide a list of op types to exclude from running on TRT
- e.g. `trt_op_types_to_exclude="MaxPool"`

There is a known performance issue with the DDS (data-dependent shape) ops
(NonMaxSuppression, NonZero and RoiAlign) in TRT versions 10.0 to 10.7.
TRT EP excludes the DDS ops from running on TRT by default; users can
override the default value with an empty string to include all ops.
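
As a rough illustration of how this option might be passed from the Python API, here is a minimal sketch. It assumes an ONNX Runtime build that actually includes the `trt_op_types_to_exclude` option (per the later comments in this thread, the 1.20.1 patch release does not expose it). The helper names `trt_providers` and `create_session` are hypothetical, and joining several op types with commas is an assumption, since the description above only shows a single op type.

```python
# DDS op types named in the PR description.
DDS_OP_TYPES = ("NonMaxSuppression", "NonZero", "RoiAlign")


def trt_providers(exclude_op_types=DDS_OP_TYPES):
    """Build a `providers` list for onnxruntime.InferenceSession.

    Hypothetical helper. Assumption: multiple op types are joined with
    commas; the PR text only shows a single op type ("MaxPool").
    """
    trt_options = {"trt_op_types_to_exclude": ",".join(exclude_op_types)}
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes not placed on TRT
        "CPUExecutionProvider",
    ]


def create_session(model_path, exclude_op_types=DDS_OP_TYPES):
    """Create a session with the TRT EP and the given exclusion list."""
    # Imported lazily so trt_providers() works without onnxruntime installed.
    import onnxruntime as ort

    return ort.InferenceSession(
        model_path, providers=trt_providers(exclude_op_types)
    )
```

Under these assumptions, `trt_providers(["MaxPool"])` mirrors the example above, and `trt_providers([])` yields the empty string that the description says overrides the default and includes all ops.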
jywu-msft requested a review from chilo-ms on November 15, 2024 19:45
chilo-ms (Contributor) commented Nov 16, 2024

Could we consider cherry-picking this PR?
#22863

Update: We reverted TRT EP's change, so this PR is no longer needed.

liqunfu and others added 3 commits November 17, 2024 23:58
Signed-off-by: Liqun Fu <[email protected]>
Signed-off-by: Liqun Fu <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
TRT EP excludes DDS ops from running on TRT and doesn't allow users to
change this behavior.
This PR is for the ORT 1.20.1 patch release.

We will have a better solution that adds a new provider option to exclude
specific ops, similar to the following:
#22863
#22681
### Description
- Updates pipelines to use QNN SDK 2.28.2.241116.
- Re-enables LayerNormalization unit tests that failed with accuracy
errors with the previous QNN SDK (2.28.0).
- Updates QNN EP to no longer provide a dummy bias for LayerNorm if the
QNN SDK version is >= 2.28.0.
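
The version gate described in the last bullet can be sketched as follows. This is only an illustration of the comparison logic, not the QNN EP's actual code (which is C++). The function names and the `has_bias` parameter are hypothetical, and the sketch assumes SDK version strings of the form `major.minor.patch[.build]`, as in `2.28.2.241116`.

```python
def parse_sdk_version(version):
    """Parse 'major.minor.patch[.build]' into a (major, minor, patch) tuple."""
    parts = version.split(".")
    return tuple(int(p) for p in parts[:3])


def needs_dummy_layernorm_bias(sdk_version, has_bias):
    """Return True if a placeholder bias would still be supplied.

    Hypothetical helper: a dummy bias is only considered when the node has
    no bias of its own and the SDK is older than 2.28.0.
    """
    if has_bias:
        return False
    # Tuple comparison is lexicographic: major first, then minor, then patch.
    return parse_sdk_version(sdk_version) < (2, 28, 0)
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as `"2.9.0" > "2.28.0"` under string comparison.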


### Motivation and Context
Use the latest QNN SDK. This version improves inference latency for
certain customer models.
yf711 requested a review from a team on November 19, 2024 04:11
yf711 merged commit 5c1b7cc into rel-1.20.1 on Nov 19, 2024
238 of 245 checks passed
yf711 deleted the yifanl/round-2-cherry-pick-rel-1.20.1 branch on November 19, 2024 21:25
6 participants