[quant] Add a quantized batch_norm operator #33080
Conversation
Summary: Quantized batch norm for cases where batch norm cannot be fused with conv. The AVX2 implementation is from Caffe2.
Test Plan: `python test/test_quantized.py TestQuantizedOps.test_batch_norm`
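The core of an unfused quantized batch norm can be sketched in a few lines: fold gamma, beta, running mean, and running variance into a per-channel scale and shift, dequantize the input, apply the affine transform, and requantize. Below is a minimal NumPy reference under those assumptions; it is a hypothetical sketch, not the actual PyTorch/Caffe2 kernel, and all names are illustrative:

```python
import numpy as np

def ref_quantized_batch_norm(qx, x_scale, x_zp, gamma, beta, mean, var,
                             eps, y_scale, y_zp):
    """Reference quantized batch norm over an NHWC uint8 input (hypothetical)."""
    # Fold batch norm y = gamma * (x - mean) / sqrt(var + eps) + beta
    # into a per-channel affine: y = alpha * x + shift
    inv_std = 1.0 / np.sqrt(var + eps)
    alpha = gamma * inv_std
    shift = beta - mean * alpha
    # Dequantize, apply the per-channel affine (broadcast over N, H, W),
    # then requantize to uint8 with the output scale / zero point
    x = (qx.astype(np.float32) - x_zp) * x_scale
    y = x * alpha + shift
    qy = np.round(y / y_scale) + y_zp
    return np.clip(qy, 0, 255).astype(np.uint8)
```

With identity parameters (gamma=1, beta=0, mean=0, var=1) and matching input/output quantization, the op is close to a pass-through, which gives a quick sanity check.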
raghuramank100 left a comment:
Looks good! A few requests:
aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp (four review comments, outdated and resolved)
```cpp
constexpr int kVLen = 8;
const int outer_size = N * HxW;
using Vec = Vec256<quint8>;
for (int i = 0; i < outer_size; ++i) {
```
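The snippet above iterates over `outer_size = N * HxW` rows and, within each row, processes channels in vector-width chunks of `kVLen`. A scalar Python sketch of that loop nest, mirroring the kernel's structure rather than its AVX2 intrinsics (names are illustrative):

```python
import numpy as np

def batch_norm_loop_nest(x, alpha, beta):
    """x: float array of shape [outer_size, C]; alpha/beta: per-channel params."""
    kVLen = 8
    outer_size, C = x.shape
    out = np.empty_like(x)
    for i in range(outer_size):        # one row per (n, h, w) position
        for c0 in range(0, C, kVLen):  # channel chunks of width kVLen
            c1 = min(c0 + kVLen, C)    # tail chunk may be narrower
            out[i, c0:c1] = x[i, c0:c1] * alpha[c0:c1] + beta[c0:c1]
    return out
```

The per-channel alpha/beta are computed once outside the loop, so each inner-loop chunk is a pure multiply-add, which is what maps onto the AVX2 lanes.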
I think we might be able to simplify this further, since (correct me if I'm wrong) this is essentially just an elementwise access pattern where we broadcast alpha and beta to all [N, H, W]. It might be fairly simple to just use TensorIterator instead of handling the loop nest explicitly. I can elaborate a bit more later.
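The simplification the reviewer describes, broadcasting per-channel alpha and beta over [N, H, W], can be illustrated with plain NumPy broadcasting; TensorIterator handles the same elementwise pattern on the C++ side. Shapes here are arbitrary for demonstration:

```python
import numpy as np

N, H, W, C = 2, 4, 4, 8
x = np.random.rand(N, H, W, C).astype(np.float32)
alpha = np.random.rand(C).astype(np.float32)
beta = np.random.rand(C).astype(np.float32)

# The explicit loop nest over the N*H*W positions ...
looped = np.empty_like(x)
for n in range(N):
    for h in range(H):
        for w in range(W):
            looped[n, h, w] = x[n, h, w] * alpha + beta

# ... is the same elementwise op with alpha/beta broadcast to [N, H, W, C]
broadcast = x * alpha + beta
assert np.allclose(looped, broadcast)
```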
Blocked by #33166
jamesr66a left a comment:
LG, just one comment
aten/src/ATen/native/quantized/cpu/kernels/QuantizedOpKernels.cpp (one review comment, outdated and resolved)
This pull request has been merged in d043560.
Summary: Pull Request resolved: pytorch#33080. Quantized batch norm for cases where batch norm cannot be fused with conv. The AVX2 implementation is from Caffe2.
Test Plan: `python test/test_quantized.py TestQuantizedOps.test_batch_norm`
Imported from OSS. Differential Revision: D19861927. fbshipit-source-id: bd8cd101fc063cb6358132ab7c651a160999293c
Summary: Adds a quantized implementation of LayerNorm for server. Relevant PRs: #20345 (floating point LN), #33080 (quantized BN). A future PR will add the Python wrapper.
Test Plan: numerics match the floating point implementation. TODO: benchmarks.
Pull Request resolved: #35329
Stack from ghstack:

Summary:
Quantized batch norm for cases where batch norm cannot be fused with conv.
The AVX2 implementation is from Caffe2.

Test Plan:
`python test/test_quantized.py TestQuantizedOps.test_batch_norm`

Differential Revision: D19861927