NhwcFusedConv: Add before Activation#15837
Merged
chenfucn merged 2 commits intomicrosoft:mainfrom May 9, 2023
Merged
Conversation
4adfaf8 to
40921a8
Compare
40921a8 to
49bcd76
Compare
jchen351
approved these changes
May 7, 2023
yufenglee
approved these changes
May 9, 2023
prathikr
pushed a commit
that referenced
this pull request
May 16, 2023
### Description Fp16 FusedConv and NhwcFusedConv. Fused Add operator should be performed BEFORE the activation operator. ### Motivation and Context Previous understanding of fused conv is incorrect.
snnn
pushed a commit
that referenced
this pull request
May 19, 2023
### Description Fp16 FusedConv and NhwcFusedConv. Fused Add operator should be performed BEFORE the activation operator. ### Motivation and Context Previous understanding of fused conv is incorrect.
snnn
pushed a commit
that referenced
this pull request
May 19, 2023
### Description Fp16 FusedConv and NhwcFusedConv. Fused Add operator should be performed BEFORE the activation operator. ### Motivation and Context Previous understanding of fused conv is incorrect.
snnn
pushed a commit
that referenced
this pull request
May 19, 2023
### Description Cherry-picks 26 commits to the release branch. Most cherry-picks are clean merges. Except: 1. When I got conflicts in cgmanifest.json and download-deps.yml, I choose to ignore the conflicts and regenerate the two files 2. There were some conflicts in cmake/deps.txt, onnxruntime_c_api.cc PR list: [js/webgpu] fix Transpose with non-float tensor (#15819) [js/web] fix terser reserved symbols for worker (#15864) [JSEP] fix constructor for OrtDevice (#15805) Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799) Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798) [wasm] revert emsdk to v3.1.19 (#15793) [wasm/JSEP] add threaded build to artifacts (#15777) [js/web] add target ort.webgpu.min.js (#15780) update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688) fix: setting builder optimization level to TRT 8.6 default (#15897) Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921) Fix segfault for multiple GPU run (regression) (#15823) android package fix (#15999) [CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (#15993) Adding support for conv fp16 fusion on Resnet50v1 (#15474) update onnx release 1.14 for docker files (#15680) Avoid generating training documentation during packaging (#15795) Update Conv-Add-Relu Fusion Transformation (#15834) Fix symbolic shape infer empty value_info (#15842) NhwcFusedConv: Add before Activation (#15837) use __hmul2 instead of __hmul2_rn (#15852) change the EP device to default OrtDevice() for memoryType equals CPU Input (#15903) Fixing NhwcFusedConv fp16 (#15950) fix topo sort in quantization tool (#16003) [doc] add LeakyRelu to coreml supported ops (#15944) [DML EP] Add frequent upload heap flushing (#15960) Co-authored-by: Yulong Wang Co-authored-by: dependabot[bot] Co-authored-by: Guenther Schmuelling Co-authored-by: Shalva Mist Co-authored-by: Maximilian Müller Co-authored-by: Dmitri Smirnov Co-authored-by: pengwa Co-authored-by: Ashwini Khade Co-authored-by: Edward Chen Co-authored-by: Jian Chen Co-authored-by: liqun Fu Co-authored-by: Baiju Meswani Co-authored-by: Tianlei Wu Co-authored-by: Chen Fu Co-authored-by: Ye Wang Co-authored-by: cao lei Co-authored-by: Yufeng Li Co-authored-by: Rachel Guo Co-authored-by: Patrice Vignola
preetha-intel
pushed a commit
to intel/onnxruntime
that referenced
this pull request
Jun 7, 2023
### Description Cherry-picks 26 commits to the release branch. Most cherry-picks are clean merges. Except: 1. When I got conflicts in cgmanifest.json and download-deps.yml, I choose to ignore the conflicts and regenerate the two files 2. There were some conflicts in cmake/deps.txt, onnxruntime_c_api.cc PR list: [js/webgpu] fix Transpose with non-float tensor (microsoft#15819) [js/web] fix terser reserved symbols for worker (microsoft#15864) [JSEP] fix constructor for OrtDevice (microsoft#15805) Bump engine.io from 6.4.1 to 6.4.2 in /js/web (microsoft#15799) Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (microsoft#15798) [wasm] revert emsdk to v3.1.19 (microsoft#15793) [wasm/JSEP] add threaded build to artifacts (microsoft#15777) [js/web] add target ort.webgpu.min.js (microsoft#15780) update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (microsoft#15688) fix: setting builder optimization level to TRT 8.6 default (microsoft#15897) Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (microsoft#15921) Fix segfault for multiple GPU run (regression) (microsoft#15823) android package fix (microsoft#15999) [CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (microsoft#15993) Adding support for conv fp16 fusion on Resnet50v1 (microsoft#15474) update onnx release 1.14 for docker files (microsoft#15680) Avoid generating training documentation during packaging (microsoft#15795) Update Conv-Add-Relu Fusion Transformation (microsoft#15834) Fix symbolic shape infer empty value_info (microsoft#15842) NhwcFusedConv: Add before Activation (microsoft#15837) use __hmul2 instead of __hmul2_rn (microsoft#15852) change the EP device to default OrtDevice() for memoryType equals CPU Input (microsoft#15903) Fixing NhwcFusedConv fp16 (microsoft#15950) fix topo sort in quantization tool (microsoft#16003) [doc] add LeakyRelu to coreml supported ops (microsoft#15944) [DML EP] Add frequent upload heap flushing (microsoft#15960) Co-authored-by: Yulong Wang Co-authored-by: dependabot[bot] Co-authored-by: Guenther Schmuelling Co-authored-by: Shalva Mist Co-authored-by: Maximilian Müller Co-authored-by: Dmitri Smirnov Co-authored-by: pengwa Co-authored-by: Ashwini Khade Co-authored-by: Edward Chen Co-authored-by: Jian Chen Co-authored-by: liqun Fu Co-authored-by: Baiju Meswani Co-authored-by: Tianlei Wu Co-authored-by: Chen Fu Co-authored-by: Ye Wang Co-authored-by: cao lei Co-authored-by: Yufeng Li Co-authored-by: Rachel Guo Co-authored-by: Patrice Vignola
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fp16 FusedConv and NhwcFusedConv. Fused Add operator should be performed BEFORE the activation operator.
Motivation and Context
Previous understanding of fused conv is incorrect.