fix: setting builder optimization level to TRT 8.6 default#15897
fix: setting builder optimization level to TRT 8.6 default#15897chilo-ms merged 3 commits intomicrosoft:mainfrom
Conversation
|
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline |
|
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 9 pipeline(s). |
|
/azp run Linux QNN CI Pipeline, Windows ARM64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
Could you help merge the main to pass "Linux GPU TensorRT CI Pipeline"? |
|
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline |
|
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed |
|
/azp run Linux QNN CI Pipeline, Windows ARM64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 9 pipeline(s). |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 2 pipeline(s). |
There was a problem hiding this comment.
line 2831 & 2892 need to be modified as well.
if (trt_state->builder_optimization_level != 2) {
There was a problem hiding this comment.
we'll have to kick off the CI again after those lines are updated.
There was a problem hiding this comment.
we really need to simplify option updates, too error prone currently due to the need to change too many places in code.
There was a problem hiding this comment.
Done! Is this something your team is working on right now ?
There was a problem hiding this comment.
it's on our radar and we would like to dedicate some time/resources on it.
|
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, Windows CPU CI Pipeline, Windows GPU CI Pipeline |
|
/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed |
|
/azp run Linux QNN CI Pipeline, Windows ARM64 QNN CI Pipeline |
|
Azure Pipelines successfully started running 5 pipeline(s). |
|
Azure Pipelines successfully started running 9 pipeline(s). |
|
Azure Pipelines successfully started running 2 pipeline(s). |
The actual released default level is 3 and not the previously used 2. Just a small sample of the effects: 
The actual released default level is 3 and not the previously used 2. Just a small sample of the effects: 
The actual released default level is 3 and not the previously used 2. Just a small sample of the effects: 
### Description Cherry-picks 26 commits to the release branch. Most cherry-picks are clean merges. Except: 1. When I got conflicts in cgmanifest.json and download-deps.yml, I choose to ignore the conflicts and regenerate the two files 2. There were some conflicts in cmake/deps.txt, onnxruntime_c_api.cc PR list: [js/webgpu] fix Transpose with non-float tensor (#15819) [js/web] fix terser reserved symbols for worker (#15864) [JSEP] fix constructor for OrtDevice (#15805) Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799) Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798) [wasm] revert emsdk to v3.1.19 (#15793) [wasm/JSEP] add threaded build to artifacts (#15777) [js/web] add target ort.webgpu.min.js (#15780) update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688) fix: setting builder optimization level to TRT 8.6 default (#15897) Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921) Fix segfault for multiple GPU run (regression) (#15823) android package fix (#15999) [CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (#15993) Adding support for conv fp16 fusion on Resnet50v1 (#15474) update onnx release 1.14 for docker files (#15680) Avoid generating training documentation during packaging (#15795) Update Conv-Add-Relu Fusion Transformation (#15834) Fix symbolic shape infer empty value_info (#15842) NhwcFusedConv: Add before Activation (#15837) use __hmul2 instead of __hmul2_rn (#15852) change the EP device to default OrtDevice() for memoryType equals CPU Input (#15903) Fixing NhwcFusedConv fp16 (#15950) fix topo sort in quantization tool (#16003) [doc] add LeakyRelu to coreml supported ops (#15944) [DML EP] Add frequent upload heap flushing (#15960) Co-authored-by: Yulong Wang Co-authored-by: dependabot[bot] Co-authored-by: Guenther Schmuelling Co-authored-by: Shalva Mist Co-authored-by: Maximilian Müller Co-authored-by: Dmitri Smirnov Co-authored-by: pengwa Co-authored-by: Ashwini Khade Co-authored-by: Edward Chen Co-authored-by: Jian Chen Co-authored-by: liqun Fu Co-authored-by: Baiju Meswani Co-authored-by: Tianlei Wu Co-authored-by: Chen Fu Co-authored-by: Ye Wang Co-authored-by: cao lei Co-authored-by: Yufeng Li Co-authored-by: Rachel Guo Co-authored-by: Patrice Vignola
### Description Cherry-picks 26 commits to the release branch. Most cherry-picks are clean merges. Except: 1. When I got conflicts in cgmanifest.json and download-deps.yml, I choose to ignore the conflicts and regenerate the two files 2. There were some conflicts in cmake/deps.txt, onnxruntime_c_api.cc PR list: [js/webgpu] fix Transpose with non-float tensor (microsoft#15819) [js/web] fix terser reserved symbols for worker (microsoft#15864) [JSEP] fix constructor for OrtDevice (microsoft#15805) Bump engine.io from 6.4.1 to 6.4.2 in /js/web (microsoft#15799) Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (microsoft#15798) [wasm] revert emsdk to v3.1.19 (microsoft#15793) [wasm/JSEP] add threaded build to artifacts (microsoft#15777) [js/web] add target ort.webgpu.min.js (microsoft#15780) update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (microsoft#15688) fix: setting builder optimization level to TRT 8.6 default (microsoft#15897) Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (microsoft#15921) Fix segfault for multiple GPU run (regression) (microsoft#15823) android package fix (microsoft#15999) [CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (microsoft#15993) Adding support for conv fp16 fusion on Resnet50v1 (microsoft#15474) update onnx release 1.14 for docker files (microsoft#15680) Avoid generating training documentation during packaging (microsoft#15795) Update Conv-Add-Relu Fusion Transformation (microsoft#15834) Fix symbolic shape infer empty value_info (microsoft#15842) NhwcFusedConv: Add before Activation (microsoft#15837) use __hmul2 instead of __hmul2_rn (microsoft#15852) change the EP device to default OrtDevice() for memoryType equals CPU Input (microsoft#15903) Fixing NhwcFusedConv fp16 (microsoft#15950) fix topo sort in quantization tool (microsoft#16003) [doc] add LeakyRelu to coreml supported ops (microsoft#15944) [DML EP] Add frequent upload heap flushing (microsoft#15960) Co-authored-by: Yulong Wang Co-authored-by: dependabot[bot] Co-authored-by: Guenther Schmuelling Co-authored-by: Shalva Mist Co-authored-by: Maximilian Müller Co-authored-by: Dmitri Smirnov Co-authored-by: pengwa Co-authored-by: Ashwini Khade Co-authored-by: Edward Chen Co-authored-by: Jian Chen Co-authored-by: liqun Fu Co-authored-by: Baiju Meswani Co-authored-by: Tianlei Wu Co-authored-by: Chen Fu Co-authored-by: Ye Wang Co-authored-by: cao lei Co-authored-by: Yufeng Li Co-authored-by: Rachel Guo Co-authored-by: Patrice Vignola
The actual released default level is 3 and not the previously used 2.
Just a small sample of the effects:
