
Adding support for conv fp16 fusion on Resnet50v1#15474

Merged
yufenglee merged 22 commits into main from Cjian/conv_fp16_fusion
May 3, 2023
Conversation

Contributor

@jchen351 jchen351 commented Apr 12, 2023

Description

Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1.

Motivation and Context

Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1.
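The Conv-Add and Conv-Add-act patterns described here can be illustrated with a minimal graph-pass sketch. This is a hypothetical simplification, not the onnxruntime implementation: the `Node` structure and `fuse_conv_add_act` function are invented for clarity, and the real pass works on onnxruntime's internal graph classes and checks many more preconditions (single consumer, fp16 types, tensor shapes, etc.).

```python
# Minimal sketch of a Conv-Add(-activation) fusion pass over a
# topologically ordered node list. Hypothetical data structures.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                  # e.g. "Conv", "Add", "Relu"
    inputs: list
    outputs: list
    fused: list = field(default_factory=list)  # ops folded into this node

def fuse_conv_add_act(nodes):
    """Fold an Add (and an optional following Relu) into a preceding Conv."""
    out = []
    i = 0
    while i < len(nodes):
        n = nodes[i]
        if (n.op == "Conv"
                and i + 1 < len(nodes)
                and nodes[i + 1].op == "Add"
                and n.outputs[0] in nodes[i + 1].inputs):
            add = nodes[i + 1]
            n.fused.append("Add")
            n.outputs = add.outputs   # Conv now produces the Add's output
            i += 2
            # Optionally absorb a following activation (Conv-Add-act).
            if (i < len(nodes) and nodes[i].op == "Relu"
                    and n.outputs[0] in nodes[i].inputs):
                n.fused.append("Relu")
                n.outputs = nodes[i].outputs
                i += 1
            out.append(n)
        else:
            out.append(n)
            i += 1
    return out
```

For a Conv → Add → Relu chain, the pass collapses three nodes into a single Conv node whose `fused` list records the absorbed operators, which is the shape of transformation this PR applies for fp16 models such as Resnet50v1.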

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU CI Pipeline

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU ATen Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


Contributor Author

jchen351 commented Apr 14, 2023

The result is resnet50_fp16_fused.onnx
[image: screenshot of the fused resnet50_fp16_fused.onnx graph]

jchen351 marked this pull request as ready for review April 17, 2023 17:58
@chenfucn
Contributor

Description

Adding support for conv fp16 fusion on Resnet50v1

Motivation and Context

Adding support for conv fp16 fusion on Resnet50v1

This is way too vague. First, the graph optimizer should not be targeted at one specific model; it should work for all models. Secondly, the description does not give enough detail about the fusion being performed: what operators are involved? What are the preconditions? What is the result of the fusion?

Contributor

chenfucn left a comment


This change is way too simplistic. Onnxruntime's key competitive edge is "one model, deploy everywhere". The complexity of deploying to various CPUs with different features is handled by the CPU EP, and this is of vital importance for our team.

jchen351 requested a review from chenfucn May 2, 2023 02:09
chenfucn previously approved these changes May 2, 2023
yufenglee merged commit 5eedd88 into main May 3, 2023
yufenglee deleted the Cjian/conv_fp16_fusion branch May 3, 2023 22:48
ShukantPal pushed a commit to ShukantPal/onnxruntime that referenced this pull request May 7, 2023
### Description
Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act.
Specifically tested on Resnet50v1.



### Motivation and Context
Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act.
Specifically tested on on Resnet50v1
prathikr pushed a commit that referenced this pull request May 16, 2023
snnn added the triage:approved (Approved for cherrypicks for release) label May 18, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
### Description
Cherry-picks 26 commits to the release branch. 
Most cherry-picks are clean merges. Except:

1. For the conflicts in cgmanifest.json and download-deps.yml, I chose to
ignore the conflicts and regenerated the two files.
2. There were some conflicts in cmake/deps.txt and onnxruntime_c_api.cc.


PR list:

[js/webgpu] fix Transpose with non-float tensor (#15819)
[js/web] fix terser reserved symbols for worker (#15864)
[JSEP] fix constructor for OrtDevice (#15805)
Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799)
Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798)
[wasm] revert emsdk to v3.1.19 (#15793)
[wasm/JSEP] add threaded build to artifacts (#15777)
[js/web] add target ort.webgpu.min.js (#15780)
update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688)
fix: setting builder optimization level to TRT 8.6 default (#15897)
Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921)
Fix segfault for multiple GPU run (regression) (#15823)
android package fix (#15999)
[CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (#15993)
Adding support for conv fp16 fusion on Resnet50v1 (#15474)
update onnx release 1.14 for docker files (#15680)
Avoid generating training documentation during packaging (#15795)
Update Conv-Add-Relu Fusion Transformation (#15834)
Fix symbolic shape infer empty value_info (#15842)
NhwcFusedConv: Add before Activation (#15837)
use __hmul2 instead of __hmul2_rn (#15852)
change the EP device to default OrtDevice() for memoryType equals CPU Input (#15903)
Fixing NhwcFusedConv fp16 (#15950)
fix topo sort in quantization tool (#16003)
[doc] add LeakyRelu to coreml supported ops (#15944)
[DML EP] Add frequent upload heap flushing (#15960)

Co-authored-by: Yulong Wang 
Co-authored-by: dependabot[bot] 
Co-authored-by: Guenther Schmuelling 
Co-authored-by: Shalva Mist 
Co-authored-by: Maximilian Müller 
Co-authored-by: Dmitri Smirnov 
Co-authored-by: pengwa 
Co-authored-by: Ashwini Khade 
Co-authored-by: Edward Chen 
Co-authored-by: Jian Chen 
Co-authored-by: liqun Fu 
Co-authored-by: Baiju Meswani 
Co-authored-by: Tianlei Wu 
Co-authored-by: Chen Fu 
Co-authored-by: Ye Wang 
Co-authored-by: cao lei 
Co-authored-by: Yufeng Li 
Co-authored-by: Rachel Guo 
Co-authored-by: Patrice Vignola
snnn removed the triage:approved (Approved for cherrypicks for release) and release:1.15 labels May 19, 2023
preetha-intel pushed a commit to intel/onnxruntime that referenced this pull request Jun 7, 2023