[QNN EP] Enablement of 64bit Udma mode #26677

Merged
edgchen1 merged 4 commits into microsoft:main from CodeLinaro:dev/qti-monumeen/64bit-udma-mode-enablement
Feb 3, 2026
@qti-monumeen
Contributor

Description

Enable 64-bit UDMA mode for device architecture v81 or later.

Motivation and Context

Support 64-bit UDMA mode to run models efficiently on HTP targets v81 or above.

@qti-monumeen qti-monumeen marked this pull request as ready for review November 28, 2025 10:18
@edgchen1 edgchen1 added the ep:QNN issues related to QNN execution provider label Dec 1, 2025
@quic-tirupath
Contributor

@edgchen1
Could you please review and trigger CI on this PR?

std::filesystem::remove_all(dump_dir);
}

// Test exended UDMA mode on supported hardware (should run successfully)
Suggested change
// Test exended UDMA mode on supported hardware (should run successfully)
// Test extended UDMA mode on supported hardware (should run successfully)

How do we know if we are on supported hardware?

@edgchen1
Contributor

/azp run Linux QNN CI Pipeline,Windows ARM64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@yuslepukhin
Member

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows OpenVINO CI Pipeline, Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

Contributor

Copilot AI left a comment

Pull request overview

This PR enables 64-bit UDMA (User-space Direct Memory Access) mode for QNN (Qualcomm Neural Network) execution provider on device architecture v81 and above. The feature is designed to improve performance on supported HTP (Hexagon Tensor Processor) hardware by enabling extended UDMA capabilities.

Key changes include:

  • Addition of a new extended_udma provider option that accepts values "0" (disabled) or "1" (enabled), defaulting to disabled
  • Integration of the extended UDMA configuration into the QNN context creation flow
  • Updates to test infrastructure files to support the new option in command-line argument parsing
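The option semantics described in the first bullet ("0" or "1", disabled by default) can be sketched as follows. This is a minimal illustration, not the PR's code: `ProviderOptions` and `ParseExtendedUdma` are hypothetical names, and rejecting other values with an exception is an assumption about how invalid input might be handled.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a provider-options map; not the actual ORT type.
using ProviderOptions = std::map<std::string, std::string>;

// Hypothetical helper: returns true only when "extended_udma" is set to "1".
// A missing key falls back to the default (disabled). Throwing on any value
// other than "0" or "1" is an assumption made for this sketch.
bool ParseExtendedUdma(const ProviderOptions& options) {
  const auto it = options.find("extended_udma");
  if (it == options.end()) {
    return false;  // default: disabled
  }
  if (it->second == "1") {
    return true;
  }
  if (it->second == "0") {
    return false;
  }
  throw std::invalid_argument("extended_udma must be \"0\" or \"1\", got: " + it->second);
}
```

With this shape, a caller that never sets the key keeps the existing behaviour, and only an explicit "1" opts in.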

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| onnxruntime/test/providers/qnn/qnn_basic_test.cc | Adds test case for extended UDMA mode functionality with htp_arch v81 |
| onnxruntime/test/perftest/ort_test_session.cc | Adds "extended_udma" to QNN provider options list and validation |
| onnxruntime/test/perftest/command_args_parser.cc | Adds documentation for the extended_udma option in help text |
| onnxruntime/test/onnx/main.cc | Adds "extended_udma" option validation and documentation |
| onnxruntime/test/ep_weight_sharing_ctx_gen/command_args_parser.cc | Adds "extended_udma" to context generation tool options |
| onnxruntime/core/providers/qnn/qnn_execution_provider.h | Adds member variable to store extended UDMA mode flag |
| onnxruntime/core/providers/qnn/qnn_execution_provider.cc | Parses extended_udma option and passes it to backend manager |
| onnxruntime/core/providers/qnn/builder/qnn_backend_manager.h | Updates method signatures to accept extended UDMA parameter |
| onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc | Implements extended UDMA configuration in QNN context creation |
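The flow in the per-file summary (the execution provider stores the flag, then the backend manager applies it during context creation) might look roughly like the sketch below. All names here (`EpSettings`, `kMinExtendedUdmaArch`, `BuildContextConfigs`) are hypothetical, the config entry is modelled as a plain string instead of a real QNN structure, and gating on `htp_arch >= 81` is an assumption drawn from the PR description, not a confirmed detail of the implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of what the execution provider stores.
struct EpSettings {
  int htp_arch = 0;            // e.g. 81 for HTP v81
  bool extended_udma = false;  // parsed from the "extended_udma" option
};

// Assumption: extended UDMA only applies to HTP v81 or later.
constexpr int kMinExtendedUdmaArch = 81;

// Hypothetical backend-manager step: collect the names of the context config
// entries to request. A real implementation would build QNN config
// structures; strings are used here only to keep the sketch self-contained.
std::vector<std::string> BuildContextConfigs(const EpSettings& settings) {
  std::vector<std::string> configs;
  if (settings.extended_udma && settings.htp_arch >= kMinExtendedUdmaArch) {
    configs.push_back("extended_udma");
  }
  return configs;
}
```

Keeping the decision in one function means the option can be threaded through unchanged and only interpreted where the context is created.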

@yuslepukhin
Member

Please respond to the comments so the answers are documented.

@yuslepukhin
Member

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@yuslepukhin
Member

This branch needs to be rebased (merged) from main.

edgchen1 previously approved these changes Jan 15, 2026
Support 64-bit UDMA mode to run models efficiently on HTP v81 or above.
Implement a new QNN option "extended_udma" and propagate it through
context config to HTP.
@minfhong-qti minfhong-qti force-pushed the dev/qti-monumeen/64bit-udma-mode-enablement branch from 54d64dc to 8550225 January 19, 2026 03:31
@tirupath-qti
Contributor

@edgchen1 and @yuslepukhin
This PR missed 1.24, but we still want to merge it into mainline because it is required to enable the GPT-OSS model on QC platforms. It can be picked into any future point release.

Could you please review and trigger CI?

@edgchen1
Contributor

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@tirupath-qti
Contributor

@edgchen1
It seems that one CI job unrelated to QNN EP failed. Could you please unblock this PR and approve it?
Note: this is needed to enable the GPT-OSS QDQ model on QC hardware.

@edgchen1 edgchen1 enabled auto-merge (squash) January 30, 2026 01:08
@edgchen1 edgchen1 merged commit 711d155 into microsoft:main Feb 3, 2026
119 of 126 checks passed
tianleiwu pushed a commit that referenced this pull request Feb 12, 2026
### Description
Enable 64-bit UDMA mode for device architecture v81 or later.

### Motivation and Context
Support 64-bit UDMA mode to run models efficiently on HTP targets v81 or above.
tianleiwu added a commit that referenced this pull request Feb 13, 2026
This cherry-picks the following commits for the 1.24.2 release:
- #27096
- #27077
- #26677
- #27238
- #27213
- #27256
- #27278
- #27275
- #27276
- #27216
- #27271
- #27299
- #27294
- #27266
- #27176
- #27126
- #27252

---------

Co-authored-by: Xiaofei Han
Co-authored-by: Jiajia Qin
Co-authored-by: Yulong Wang
Co-authored-by: qti-monumeen
Co-authored-by: Ankit Maheshkar
Co-authored-by: Eric Crawford
Co-authored-by: Copilot
Co-authored-by: guschmue
Co-authored-by: Guenther Schmuelling
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: angelser
Co-authored-by: Angela Serrano Brummett
Co-authored-by: Misha Chornyi
Co-authored-by: hariharans29
Co-authored-by: eserscor
Co-authored-by: Copilot
Co-authored-by: Baiju Meswani
Co-authored-by: Adrian Lizarraga
Co-authored-by: Ti-Tai Wang
Co-authored-by: bmehta001