
Adding support for conv fp16 fusion on Resnet50v1#15474

Merged
yufenglee merged 22 commits into main from Cjian/conv_fp16_fusion
May 3, 2023
Conversation

Contributor

@jchen351 jchen351 commented Apr 12, 2023

Description

Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1.

Motivation and Context

Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act. Specifically tested on Resnet50v1.
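The Conv-Add and Conv-Add-act patterns described here can be illustrated with a minimal graph-pass sketch. This is a hypothetical simplification, not the onnxruntime implementation: the `Node` structure and `fuse_conv_add_act` function are invented for clarity, and the real pass works on onnxruntime's internal graph classes and checks many more preconditions (single consumer, fp16 types, tensor shapes, etc.).

```python
# Minimal sketch of a Conv-Add(-activation) fusion pass over a
# topologically ordered node list. Hypothetical data structures.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                  # e.g. "Conv", "Add", "Relu"
    inputs: list
    outputs: list
    fused: list = field(default_factory=list)  # ops folded into this node

def fuse_conv_add_act(nodes):
    """Fold an Add (and an optional following Relu) into a preceding Conv."""
    out = []
    i = 0
    while i < len(nodes):
        n = nodes[i]
        if (n.op == "Conv"
                and i + 1 < len(nodes)
                and nodes[i + 1].op == "Add"
                and n.outputs[0] in nodes[i + 1].inputs):
            add = nodes[i + 1]
            n.fused.append("Add")
            n.outputs = add.outputs   # Conv now produces the Add's output
            i += 2
            # Optionally absorb a following activation (Conv-Add-act).
            if (i < len(nodes) and nodes[i].op == "Relu"
                    and n.outputs[0] in nodes[i].inputs):
                n.fused.append("Relu")
                n.outputs = nodes[i].outputs
                i += 1
            out.append(n)
        else:
            out.append(n)
            i += 1
    return out
```

For a Conv → Add → Relu chain, the pass collapses three nodes into a single Conv node whose `fused` list records the absorbed operators, which is the shape of transformation this PR applies for fp16 models such as Resnet50v1.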

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU CI Pipeline

Contributor

snnn commented Apr 13, 2023

/azp run Linux CPU ATen Pipeline

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).


Contributor Author

jchen351 commented Apr 14, 2023

The result is resnet50_fp16_fused.onnx
[image: screenshot of the fused resnet50_fp16_fused.onnx graph]

jchen351 marked this pull request as ready for review April 17, 2023 17:58
@chenfucn
Contributor

Description

Adding support for conv fp16 fusion on Resnet50v1

Motivation and Context

Adding support for conv fp16 fusion on Resnet50v1

This is way too vague. First, the graph optimizer should not be targeted at one specific model; it should work for all models. Secondly, the description does not give enough detail about the fusion being performed: what operators are involved? What are the preconditions? What is the result of the fusion?

Contributor

chenfucn left a comment


This change is way too simplistic. Onnxruntime's key competitive edge is "one model, deploy everywhere". The complexity of deploying to various CPUs with different features is handled by the CPU EP, and this is of vital importance for our team.

jchen351 requested a review from chenfucn May 2, 2023 02:09
chenfucn previously approved these changes May 2, 2023
yufenglee merged commit 5eedd88 into main May 3, 2023
yufenglee deleted the Cjian/conv_fp16_fusion branch May 3, 2023 22:48
ShukantPal pushed a commit to ShukantPal/onnxruntime that referenced this pull request May 7, 2023
### Description
Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act.
Specifically tested on Resnet50v1.



### Motivation and Context
Adding support for conv fp16 fusion with Conv-Add and Conv-Add-act.
Specifically tested on on Resnet50v1
prathikr pushed a commit that referenced this pull request May 16, 2023
snnn added the triage:approved (Approved for cherrypicks for release) label May 18, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
snnn pushed a commit that referenced this pull request May 19, 2023
### Description
Cherry-picks 26 commits to the release branch. 
Most cherry-picks are clean merges. Except:

1. For the conflicts in cgmanifest.json and download-deps.yml, I chose to
ignore the conflicts and regenerated the two files.
2. There were some conflicts in cmake/deps.txt and onnxruntime_c_api.cc.


PR list:

[js/webgpu] fix Transpose with non-float tensor (#15819)
[js/web] fix terser reserved symbols for worker (#15864)
[JSEP] fix constructor for OrtDevice (#15805)
Bump engine.io from 6.4.1 to 6.4.2 in /js/web (#15799)
Bump engine.io from 6.4.0 to 6.4.2 in /onnxruntime/test/wasm (#15798)
[wasm] revert emsdk to v3.1.19 (#15793)
[wasm/JSEP] add threaded build to artifacts (#15777)
[js/web] add target ort.webgpu.min.js (#15780)
update ort extensions to 94142d8391c9791ec71c38336436319a2d4ac7a0 (#15688)
fix: setting builder optimization level to TRT 8.6 default (#15897)
Adust GetVersionString() GetBuildInfoString() signatures and move them to OrtApi (#15921)
Fix segfault for multiple GPU run (regression) (#15823)
android package fix (#15999)
[CoreML EP] Minor changes to allow CoreML EP to handle more nodes and models. (#15993)
Adding support for conv fp16 fusion on Resnet50v1 (#15474)
update onnx release 1.14 for docker files (#15680)
Avoid generating training documentation during packaging (#15795)
Update Conv-Add-Relu Fusion Transformation (#15834)
Fix symbolic shape infer empty value_info (#15842)
NhwcFusedConv: Add before Activation (#15837)
use __hmul2 instead of __hmul2_rn (#15852)
change the EP device to default OrtDevice() for memoryType equals CPU Input (#15903)
Fixing NhwcFusedConv fp16 (#15950)
fix topo sort in quantization tool (#16003)
[doc] add LeakyRelu to coreml supported ops (#15944)
[DML EP] Add frequent upload heap flushing (#15960)

Co-authored-by: Yulong Wang 
Co-authored-by: dependabot[bot] 
Co-authored-by: Guenther Schmuelling 
Co-authored-by: Shalva Mist 
Co-authored-by: Maximilian Müller 
Co-authored-by: Dmitri Smirnov 
Co-authored-by: pengwa 
Co-authored-by: Ashwini Khade 
Co-authored-by: Edward Chen 
Co-authored-by: Jian Chen 
Co-authored-by: liqun Fu 
Co-authored-by: Baiju Meswani 
Co-authored-by: Tianlei Wu 
Co-authored-by: Chen Fu 
Co-authored-by: Ye Wang 
Co-authored-by: cao lei 
Co-authored-by: Yufeng Li 
Co-authored-by: Rachel Guo 
Co-authored-by: Patrice Vignola
snnn removed the triage:approved (Approved for cherrypicks for release) and release:1.15 labels May 19, 2023
preetha-intel pushed a commit to intel/onnxruntime that referenced this pull request Jun 7, 2023