
Conversation

@gedoensmax (Contributor) commented Aug 5, 2025

This PR currently holds two major improvements:

  • dynamic shape models should have much lower memory usage, and memory management is moved towards ORT allocators
  • the per-inference overhead for shape binding and address updates is reduced (a minimal usage sketch follows below)
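
Below is a minimal, illustrative sketch (not part of the PR) of the scenario these changes target: a dynamic-shape model run through ONNX Runtime with I/O binding, where input shapes and buffer addresses change between inferences. The provider string, model file, and tensor names are assumptions for illustration only.

```python
# Minimal sketch, not from the PR: exercising a dynamic-shape model with
# I/O binding so shapes and buffer addresses are re-bound between runs.
# The EP name "NvTensorRTRTXExecutionProvider", the model file, and the
# tensor names "input"/"output" are assumptions for illustration.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession(
    "dynamic_batch_model.onnx",
    providers=["NvTensorRTRTXExecutionProvider"],
)

io_binding = sess.io_binding()
for batch in (1, 4, 8):  # the batch dimension changes every run
    x = np.random.rand(batch, 3, 224, 224).astype(np.float32)
    io_binding.bind_cpu_input("input", x)  # re-bind shape and address
    io_binding.bind_output("output")       # let ORT allocate the output
    sess.run_with_iobinding(io_binding)
    (out,) = io_binding.copy_outputs_to_cpu()
```

If the description holds, the shape re-binding and address updates in the loop above should incur lower per-inference overhead, and the scratch memory needed for the varying shapes is managed through ORT allocators rather than by the EP itself.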

gedoensmax and others added 5 commits August 1, 2025 13:42
@chilo-ms (Contributor) commented Aug 5, 2025

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@chilo-ms chilo-ms added ep:NvRTX NV RTX execution provider release:1.23.0 labels Aug 5, 2025
@skottmckay (Contributor) left a comment

:shipit:

@chilo-ms chilo-ms merged commit d912167 into microsoft:main Aug 6, 2025
91 of 92 checks passed
adrianlizarraga pushed a commit that referenced this pull request Aug 8, 2025
This currently holds two major improvements:
- dynamic shape models should have much lower memory usage, and memory management is moved towards ORT allocators
- the per-inference overhead for shape binding and address updates is reduced

---------

Co-authored-by: Gaurav Garg <[email protected]>
adrianlizarraga added a commit that referenced this pull request Aug 8, 2025
…5, 25652 (#25701)

### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:

- #25391
- #25611
- #25656
- #25346
- #25374
- #25664
- #25675
- #25652



---------

Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: Ishwar Raut <[email protected]>
Co-authored-by: Maximilian Müller <[email protected]>
Co-authored-by: Gaurav Garg <[email protected]>
Co-authored-by: Scott McKay <[email protected]>
Co-authored-by: Chi Lo <[email protected]>
Co-authored-by: Abhishek Jindal <[email protected]>
Co-authored-by: Dmitri Smirnov <[email protected]>
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025
This currently holds two major improvements:
- dynamic shape models should have much lower memory usage, and memory management is moved towards ORT allocators
- the per-inference overhead for shape binding and address updates is reduced

---------

Co-authored-by: Gaurav Garg <[email protected]>