@jianyuh jianyuh commented Nov 4, 2019

We would like to optimize LayerNorm with explicit vectorization using Vec256. This PR handles the part that needs special treatment: using fmadd (fused multiply-add) with 256-bit AVX vectors.

Differential Revision: D18307639

jianyuh added a commit that referenced this pull request Nov 4, 2019
ghstack-source-id: 93220825
Pull Request resolved: #29154
@jianyuh jianyuh requested a review from jamesr66a November 4, 2019 21:55
jianyuh added a commit that referenced this pull request Nov 10, 2019
ghstack-source-id: 93608764

@jamesr66a jamesr66a left a comment

cool

const T bias = -rstd_val * mean_val;
for (int64_t j = 0; j < N; ++j) {
for (j = 0; j < N / kVecSize * kVecSize; j += kVecSize) {
const vec256::Vec256<T> gamma_vec = gamma_null
What does the emitted code look like with these conditionals?

@pytorchbot
Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
Stale pull requests will automatically be closed 30 days after being marked Stale.

@github-actions github-actions bot closed this May 12, 2022
@facebook-github-bot facebook-github-bot deleted the gh/jianyuh/45/head branch June 11, 2022 14:19