[bert/RoBERTa] Optimize LayerNorm with explicit vectorization using Vec256 (2/2) #29154

jianyuh · 2019-11-04T21:53:49Z

Stack from ghstack:

#29519 [bert/RoBERTa] Optimize LayerNorm (Backward) with explicit vectorization using Vec256
#29154 [bert/RoBERTa] Optimize LayerNorm with explicit vectorization using Vec256 (2/2)
#29104 [bert/RoBERTa] Optimize LayerNorm with explicit vectorization using Vec256

We would like to optimize LayerNorm with explicit vectorization using Vec256. This PR, the second of two parts, handles the part that uses fmadd (fused multiply-add) with 256-bit AVX vectors.

Differential Revision: D18307639
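
For readers unfamiliar with the pattern, here is a minimal sketch of what "a vectorized main loop plus fmadd" can look like for one row. It is not the code from this PR: `LayerNormRowSketch` and `kVecSizeSketch` are hypothetical names, the chunk width of 8 floats assumes a 256-bit vector of `float`, and `std::fma` on scalars stands in for the Vec256 `fmadd` call. It relies on the identity `(x - mean) * rstd == x * rstd + (-rstd * mean)`, which turns the normalization into one fused multiply-add per element.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Assumption: 8 floats fit in one 256-bit vector register.
constexpr int64_t kVecSizeSketch = 8;

// Hypothetical helper (not the ATen kernel): normalizes one row of length N
// to zero mean and unit variance, without the gamma/beta affine step.
void LayerNormRowSketch(const float* x, int64_t N, float eps, float* y) {
  // First pass: accumulate the sum and the sum of squares.
  float sum = 0.0f;
  float sum_sq = 0.0f;
  for (int64_t j = 0; j < N; ++j) {
    sum += x[j];
    sum_sq = std::fma(x[j], x[j], sum_sq);
  }
  const float n = static_cast<float>(N);
  const float mean = sum / n;
  // Clamp at zero to guard against small negative values from rounding.
  const float var = std::max(sum_sq / n - mean * mean, 0.0f);
  const float rstd = 1.0f / std::sqrt(var + eps);
  const float bias = -rstd * mean;

  // Main loop: whole chunks of kVecSizeSketch elements. In a real kernel the
  // inner loop would be a single vector load, fmadd, and store.
  int64_t j = 0;
  const int64_t vec_end = N / kVecSizeSketch * kVecSizeSketch;
  for (; j < vec_end; j += kVecSizeSketch) {
    for (int64_t k = 0; k < kVecSizeSketch; ++k) {
      y[j + k] = std::fma(x[j + k], rstd, bias);
    }
  }
  // Scalar tail: the remaining N % kVecSizeSketch elements.
  for (; j < N; ++j) {
    y[j] = std::fma(x[j], rstd, bias);
  }
}
```

Calling it on a row of 10 elements exercises both the chunked loop (first 8 elements) and the scalar tail (last 2).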


jamesr66a

cool

jamesr66a · 2019-11-20T19:46:26Z

aten/src/ATen/native/cpu/layer_norm_kernel.cpp

      const T bias = -rstd_val * mean_val;
-      for (int64_t j = 0; j < N; ++j) {
+      for (j = 0; j < N / kVecSize * kVecSize; j += kVecSize) {
+        const vec256::Vec256<T> gamma_vec = gamma_null


What does the emitted code look like with these conditionals?
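
One way to make the question moot, sketched here as an assumption and not as what the PR does: because `gamma_null` does not change inside the loop, the check can be resolved once outside it. The sketch below uses hypothetical names (`AffineRowSketch`, `AffineRowDispatchSketch`) and scalar `std::fma` in place of Vec256; the null checks become template parameters, so each instantiated loop body contains no branch on them. Whether a given compiler would hoist the conditional on its own is not stated in this thread.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper: applies y = (x * rstd + bias) * gamma + beta to one row.
// kHasGamma / kHasBeta are compile-time constants, so the ternaries below are
// folded away in each instantiation.
template <bool kHasGamma, bool kHasBeta>
void AffineRowSketch(const float* x, const float* gamma, const float* beta,
                     float rstd, float bias, int64_t N, float* y) {
  for (int64_t j = 0; j < N; ++j) {
    const float g = kHasGamma ? gamma[j] : 1.0f;
    const float b = kHasBeta ? beta[j] : 0.0f;
    // Two fused multiply-adds: normalize, then scale and shift.
    y[j] = std::fma(std::fma(x[j], rstd, bias), g, b);
  }
}

// Hypothetical dispatcher: checks the pointers once per row, outside the loop.
void AffineRowDispatchSketch(const float* x, const float* gamma,
                             const float* beta, float rstd, float bias,
                             int64_t N, float* y) {
  const bool gamma_null = (gamma == nullptr);
  const bool beta_null = (beta == nullptr);
  if (gamma_null && beta_null) {
    AffineRowSketch<false, false>(x, gamma, beta, rstd, bias, N, y);
  } else if (gamma_null) {
    AffineRowSketch<false, true>(x, gamma, beta, rstd, bias, N, y);
  } else if (beta_null) {
    AffineRowSketch<true, false>(x, gamma, beta, rstd, bias, N, y);
  } else {
    AffineRowSketch<true, true>(x, gamma, beta, rstd, bias, N, y);
  }
}
```

The trade-off is four instantiations of the loop instead of one, in exchange for not depending on the optimizer to move the check out of the hot loop.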

pytorchbot · 2022-04-12T02:37:35Z

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
Stale pull requests will automatically be closed 30 days after being marked Stale.

This was referenced Nov 4, 2019

[pt][aten] Enable the intra-op parallelism for layer norm #28464

Closed

[bert/RoBERTa] Optimize LayerNorm with explicit vectorization using Vec256 #29104

Closed

jianyuh requested a review from jamesr66a November 4, 2019 21:55

jianyuh mentioned this pull request Nov 10, 2019

[bert/RoBERTa] Optimize LayerNorm (Backward) with explicit vectorization using Vec256 #29519

Closed

jamesr66a approved these changes Nov 20, 2019


facebook-github-bot added the cla signed label Oct 30, 2020

pytorchbot added the Stale label Apr 12, 2022

github-actions bot closed this May 12, 2022

facebook-github-bot deleted the gh/jianyuh/45/head branch June 11, 2022 14:19

5 participants