[QNN EP] Enablement of 64bit Udma mode #26677
|
@edgchen1 |
| std::filesystem::remove_all(dump_dir); | ||
| } | ||
|
|
||
| // Test exended UDMA mode on supported hardware (should run successfully) |
| // Test exended UDMA mode on supported hardware (should run successfully) | |
| // Test extended UDMA mode on supported hardware (should run successfully) |
how do we know if we are on supported hardware?
|
/azp run Linux QNN CI Pipeline,Windows ARM64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows OpenVINO CI Pipeline, Windows x64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 4 pipeline(s). |
Pull request overview
This PR enables 64-bit UDMA (User-space Direct Memory Access) mode for the QNN (Qualcomm Neural Network) execution provider on device architecture v81 and above. The feature is intended to improve performance on supported HTP (Hexagon Tensor Processor) hardware by enabling extended UDMA capabilities.
Key changes include:
- Addition of a new `extended_udma` provider option that accepts values "0" (disabled) or "1" (enabled), defaulting to disabled
- Integration of the extended UDMA configuration into the QNN context creation flow
- Updates to test infrastructure files to support the new option in command-line argument parsing
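To make the option semantics above concrete, here is a minimal sketch of how a "0"/"1" string option with a disabled default might be validated. The `ProviderOptions` alias and the `ParseExtendedUdma` helper are hypothetical names introduced for illustration only; they are not the code added by this PR, and the actual parsing in `qnn_execution_provider.cc` may differ.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical alias: provider options as string key/value pairs.
using ProviderOptions = std::unordered_map<std::string, std::string>;

// Hypothetical helper, not the actual ONNX Runtime implementation.
// Returns true for "1", false for "0", false when the key is absent
// (the option defaults to disabled), and std::nullopt for any other value.
std::optional<bool> ParseExtendedUdma(const ProviderOptions& options) {
  const auto it = options.find("extended_udma");
  if (it == options.end()) {
    return false;  // option not supplied: disabled by default
  }
  if (it->second == "1") {
    return true;
  }
  if (it->second == "0") {
    return false;
  }
  return std::nullopt;  // unrecognized value: let the caller report an error
}
```

In an application, a map of this shape (for example with `extended_udma` set to "1" alongside the existing `htp_arch` option) would typically be handed to `Ort::SessionOptions::AppendExecutionProvider("QNN", options)`. Whether the option takes effect still depends on the target hardware.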
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| onnxruntime/test/providers/qnn/qnn_basic_test.cc | Adds test case for extended UDMA mode functionality with htp_arch v81 |
| onnxruntime/test/perftest/ort_test_session.cc | Adds "extended_udma" to QNN provider options list and validation |
| onnxruntime/test/perftest/command_args_parser.cc | Adds documentation for the extended_udma option in help text |
| onnxruntime/test/onnx/main.cc | Adds "extended_udma" option validation and documentation |
| onnxruntime/test/ep_weight_sharing_ctx_gen/command_args_parser.cc | Adds "extended_udma" to context generation tool options |
| onnxruntime/core/providers/qnn/qnn_execution_provider.h | Adds member variable to store extended UDMA mode flag |
| onnxruntime/core/providers/qnn/qnn_execution_provider.cc | Parses extended_udma option and passes it to backend manager |
| onnxruntime/core/providers/qnn/builder/qnn_backend_manager.h | Updates method signatures to accept extended UDMA parameter |
| onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc | Implements extended UDMA configuration in QNN context creation |
|
Please, respond to the comments so the answers are documented. |
|
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
This branch needs to be rebased (merged) from main. |
Support 64bit udma mode to run models efficiently on HTP v81 or above. Implement a new QNN option `extended_udma` and propagate it through context config to HTP.
54d64dc to
8550225
Compare
|
@edgchen1 and @yuslepukhin Could you please review and trigger CI? |
|
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline |
|
Azure Pipelines successfully started running 4 pipeline(s). |
|
@edgchen1 |
### Description
Enabling 64bit udma mode for device architecture v81 or above
### Motivation and Context
Support 64bit udma mode to run models efficiently on HTP targets v81 or above
This cherry-picks the following commits for the 1.24.2 release:
- #27096
- #27077
- #26677
- #27238
- #27213
- #27256
- #27278
- #27275
- #27276
- #27216
- #27271
- #27299
- #27294
- #27266
- #27176
- #27126
- #27252

---------

Co-authored-by: Xiaofei Han <[email protected]>
Co-authored-by: Jiajia Qin <[email protected]>
Co-authored-by: Yulong Wang <[email protected]>
Co-authored-by: qti-monumeen <[email protected]>
Co-authored-by: Ankit Maheshkar <[email protected]>
Co-authored-by: Eric Crawford <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: guschmue <[email protected]>
Co-authored-by: Guenther Schmuelling <[email protected]>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: angelser <[email protected]>
Co-authored-by: Angela Serrano Brummett <[email protected]>
Co-authored-by: Misha Chornyi <[email protected]>
Co-authored-by: hariharans29 <[email protected]>
Co-authored-by: eserscor <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Baiju Meswani <[email protected]>
Co-authored-by: Adrian Lizarraga <[email protected]>
Co-authored-by: Ti-Tai Wang <[email protected]>
Co-authored-by: bmehta001 <[email protected]>