@chilo-ms chilo-ms commented Sep 23, 2025

Description

In the current TRT RTX EP / TRT EP implementation, constructing the IndexedSubGraph can, in some cases, include a node's unused output among the subgraph's outputs. As a result, GetCapability returns an incorrect IndexedSubGraph to ORT.
This PR adds logic that prevents unused node outputs from being added to the subgraph's outputs.

With this fix, we avoid generating an incorrect EPContext model in which the EPContext node has an unused output.
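
To illustrate the idea behind the fix, here is a minimal sketch of selecting subgraph outputs so that unconsumed node outputs are skipped. It is not the code from this PR: `ToyNode`, `CollectSubgraphOutputs`, and `ExampleUnusedMask` are hypothetical stand-ins, and the rule it assumes (keep an output only if it is a graph output or is consumed by a node outside the subgraph) is an interpretation of the description above, not the actual onnxruntime implementation.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a graph node; not the onnxruntime API.
struct ToyNode {
  int index;
  std::vector<std::string> inputs;
  std::vector<std::string> outputs;
};

// Returns the values a fused subgraph should expose as outputs: values produced
// inside the subgraph that are graph outputs or feed a node outside the subgraph.
// Outputs with no consumer at all (the "unused" case) are skipped.
std::vector<std::string> CollectSubgraphOutputs(
    const std::vector<ToyNode>& graph_nodes,
    const std::unordered_set<int>& subgraph_node_indices,
    const std::unordered_set<std::string>& graph_outputs) {
  // Gather every value read by a node that is NOT part of the subgraph.
  std::unordered_set<std::string> consumed_outside;
  for (const ToyNode& node : graph_nodes) {
    if (subgraph_node_indices.count(node.index) == 0) {
      consumed_outside.insert(node.inputs.begin(), node.inputs.end());
    }
  }

  std::vector<std::string> result;
  for (const ToyNode& node : graph_nodes) {
    if (subgraph_node_indices.count(node.index) == 0) continue;
    for (const std::string& out : node.outputs) {
      if (out.empty()) continue;  // omitted optional output
      if (graph_outputs.count(out) != 0 || consumed_outside.count(out) != 0) {
        result.push_back(out);
      }
    }
  }
  return result;
}

// Example: node 0 produces "y" and "mask", but nothing reads "mask".
std::vector<std::string> ExampleUnusedMask() {
  std::vector<ToyNode> nodes = {
      {0, {"x"}, {"y", "mask"}},  // inside the subgraph
      {1, {"y"}, {"z"}},          // outside the subgraph
  };
  return CollectSubgraphOutputs(nodes, std::unordered_set<int>{0},
                                std::unordered_set<std::string>{"z"});
}
```

In this toy graph, `ExampleUnusedMask()` exposes only `y`. A version that listed every output of node 0 would also expose `mask`, which corresponds to the fused node (and hence the EPContext node) carrying an unused output.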

@chilo-ms chilo-ms requested a review from jywu-msft September 23, 2025 18:36
@yuslepukhin
Member

I am not seeing a test that demos that this is fixed.

@jywu-msft
Member

@chilo-ms, please add a test in a separate PR. This PR needs to get into rel-1.23.1 as soon as possible.

@chilo-ms
Contributor Author

Here is the separate PR to add the unit test.
#26139

Unfortunately, we don't currently have a CI pipeline that runs the TRT RTX EP unit tests, so they will have to be run locally.

@adrianlizarraga adrianlizarraga merged commit 72e56e7 into main Sep 24, 2025
92 checks passed
@adrianlizarraga adrianlizarraga deleted the chi/fix_get_subgraph_for_trt branch September 24, 2025 03:54
adrianlizarraga pushed a commit that referenced this pull request Sep 24, 2025
…lity (#26132)
adrianlizarraga added a commit that referenced this pull request Sep 24, 2025
### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see #26057
  - main merge date: Sept 15, 11:33am
  - pr: #25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: #26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: #25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with #25689
  - main merge date: Sept 23, 10:42am
  - pr: #26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability #26132
  - main merge date: Sept 23, 8:54pm
  - pr: #26132
  - commit: 72e56e7


---------

Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
@snnn
Contributor

snnn commented Sep 25, 2025

This PR has been cherry-picked into the rel-1.23.1 branch in PR #26140. Removing the release:1.23.1 label.

TedThemistokleous added a commit to ROCm/onnxruntime that referenced this pull request Oct 17, 2025
* ORT 1.23.1 cherrypick 1 [REDO] (microsoft#26140)

### Description
Cherry-pick the following PRs into the ORT 1.23.1 branch:

- Fix Attention GQA implementation on CPU
  - **MANUAL MERGE**: see microsoft#26057
  - main merge date: Sept 15, 11:33am
  - pr: microsoft#25966
  - commit: d530b29
- Address edge GetMemInfo edge cases
  - main merge date: Sept 16, 10:32am
  - pr: microsoft#26021
  - commit: d251f3a
- Implement new Python APIs
  - main merge date: Sept 17, 11:44am
  - pr: microsoft#25999
  - commit: abc63e8
- MemcpyFromHost and MemcpyToHost support for plugin EPs
  - **MERGE CONFLICT** on file onnxruntime/test/optimizer/transpose_optimizer_test.cc. Conflicts with microsoft#25689
  - main merge date: Sept 23, 10:42am
  - pr: microsoft#26088
  - commit: 4545732
- [TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability microsoft#26132
  - main merge date: Sept 23, 8:54pm
  - pr: microsoft#26132
  - commit: 72e56e7


---------

Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chi Lo <[email protected]>

* ORT 1.23.1 cherrypick 2 (microsoft#26182)

### Description
Adds the following commits to the `rel-1.23.1` branch for ORT 1.23.1:


- add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart
  - main merge date: July 31, 1:05am
  - pr: microsoft#25590
  - commit: e753643
- [build] fix WebAssembly build on macOS/arm64
  - main merge date: Aug 5, 8:07am
  - pr: microsoft#25653
  - commit: 53f152b
- [CPU] MoE Kernel (microsoft#25958)
  - main merge date: Sept 10, 4:54pm
  - pr: microsoft#25958
  - commit: 930e640
- [CPU] Block-wise QMoE kernel for CPU
  - main merge date: Sept 15, 8:32am
  - pr: microsoft#26009
  - commit: 5d17734
- [C#] Implement missing APIs
  - main merge date: Sept 24, 10:50am
  - pr: microsoft#26101
  - commit: 35dcab5
- Regenerate test model with ONNX IR < 12
  - main merge date: Sept 24, 2:50pm
  - pr: microsoft#26149
  - commit: 88f2652
- [CPU] Fix compilation errors because of unused variables
  - main merge date: Sept 25, 1:21pm
  - pr: microsoft#26147
  - commit: 42fcd71
- [EP ABI] Check if nodes specified in GetCapability() have already been assigned
  - main merge date: Sept 26, 1:24am
  - pr: microsoft#26156
  - commit: 67d3ba0
- [QNN EP] Add dynamic option to set HTP performance mode
  - main merge date: Sept 26, 11:55am
  - pr: microsoft#26135
  - commit: 6cc40fd

---------

Co-authored-by: xieofxie <[email protected]>
Co-authored-by: hualxie <[email protected]>
Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: quic-tirupath <[email protected]>
Co-authored-by: quic-ashwshan <[email protected]>

---------

Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
Co-authored-by: Edward Chen <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: xieofxie <[email protected]>
Co-authored-by: hualxie <[email protected]>
Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Akshay Sonawane <[email protected]>
Co-authored-by: quic-tirupath <[email protected]>
Co-authored-by: quic-ashwshan <[email protected]>
fs-eire pushed a commit that referenced this pull request Oct 24, 2025
…lity (#26132)
naomiOvad pushed a commit to naomiOvad/onnxruntime that referenced this pull request Nov 2, 2025
…lity (microsoft#26132)