[Pytorch][AO] Update choose_qparams_per_token op to output correct shape for scales and zp #136807
Conversation
Update choose_qparams_per_token op to output correct shape for scales and zp

- also makes scales and zp dtype reconcile with the meta impl as well as other quantized ops' representation of scales and zero point
- make sure quantize_per_token's output_dtype is respected

There are a few places where we need to reconcile on scale and zero point dtype, but that will come later. These fixes are mainly being done to enable quantized kv cache through the ET stack.

Differential Revision: [D62301840](https://our.internmc.facebook.com/intern/diff/D62301840/)

[ghstack-poisoned]
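For context, here is a minimal sketch of the behavior this PR targets. The op name comes from the PR title; the exact output shapes and dtypes in the comments are assumptions based on the docstring diff below and the meta implementation the description says it reconciles with, not an authoritative spec.

```python
# Sketch only: assumes the decomposed-op signature
# choose_qparams_per_token(Tensor input, ScalarType dtype) -> (scales, zero_points).
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  # assumed registration point for quantized_decomposed ops

x = torch.randn(4, 16)  # e.g. (tokens, hidden) activations

scales, zero_points = torch.ops.quantized_decomposed.choose_qparams_per_token(x, torch.int8)

# Per-token qparams: one (scale, zero_point) per token, with a trailing singleton
# dim so they broadcast against the last (hidden) dim.
print(scales.shape, zero_points.shape)  # expected after this PR: torch.Size([4, 1]) each
# Dtypes after this PR (assumption from the docstring diff and meta impl):
print(scales.dtype, zero_points.dtype)  # e.g. torch.float64, torch.int64
```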
This pull request was exported from Phabricator. Differential Revision: D62301840
The docstring lines under review (diff residue cleaned up; the scales line changes from float32 to float64):

    input (torch.Tensor): quantized Tensor (uint8, int8 etc.)
-   scales (float32 torch.Tensor): quantization parameter for per token affine quantization
+   scales (float64 torch.Tensor): quantization parameter for per token affine quantization
    zero_points (int32 torch.Tensor): quantization parameter for per token affine quantization
should zero_points be int64?
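One way to answer this empirically is to compare the eager kernel against the meta kernel the PR description says it reconciles with. This is a minimal sketch under the assumption that `FakeTensorMode` dispatches the op to its meta implementation; it is not part of the PR.

```python
# Sketch: compare eager vs. meta dtypes for choose_qparams_per_token.
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  # assumed registration point for quantized_decomposed ops
from torch._subclasses.fake_tensor import FakeTensorMode

x = torch.randn(3, 8)

# Eager kernel.
s_eager, zp_eager = torch.ops.quantized_decomposed.choose_qparams_per_token(x, torch.int8)

# Meta kernel, exercised through FakeTensorMode.
with FakeTensorMode() as mode:
    fx = mode.from_tensor(x)
    s_meta, zp_meta = torch.ops.quantized_decomposed.choose_qparams_per_token(fx, torch.int8)

# After this PR the two should agree; the meta impl reportedly uses
# float64 scales and int64 zero points (an assumption, per the PR description).
print(s_eager.dtype, zp_eager.dtype)
print(s_meta.dtype, zp_meta.dtype)
```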
Update choose_qparams_per_token op to output correct shape for scales and zp. Pull Request resolved: #136807. ghstack-source-id: 245083869. Differential Revision: [D62301840](https://our.internmc.facebook.com/intern/diff/D62301840/)
@pytorchbot merge -f 'Landed internally' (Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
* Unskip `test_choose_qparams_token_asym` in 2.6 (downstream commit). Summary: Fixes pytorch#970. The test was broken by a recent refactor in pytorch: pytorch/pytorch#136807. Test Plan: CI.
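To illustrate why such a test would break: after this PR the per-token qparams carry a trailing singleton dimension, so older comparisons against flat per-row qparams need a squeeze. This is a hypothetical reconstruction (assuming the asymmetric variant shares the `(input, dtype)` signature and shape convention), not the actual downstream patch.

```python
# Hypothetical illustration of the shape change (not the actual downstream fix).
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  # assumed registration point for quantized_decomposed ops

x = torch.randn(10, 10)
scale_ref, zp_ref = torch.ops.quantized_decomposed.choose_qparams_per_token_asymmetric(
    x, torch.int8
)
# Before #136807 a test might have assumed shape (10,); afterwards the reference
# qparams are shaped (10, 1), so they are squeezed before comparing.
scale_ref = scale_ref.squeeze(-1)
zp_ref = zp_ref.squeeze(-1)
print(scale_ref.shape, zp_ref.shape)  # torch.Size([10]) torch.Size([10])
```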
Stack from ghstack (oldest at bottom):
- also makes scales and zp dtype reconcile with the meta impl as well as other quantized ops' representation of scales and zero point
- make sure quantize_per_token's output_dtype is respected (a sketch follows this description)

There are a few places where we need to reconcile on scale and zero point dtype, but that will come later. These fixes are mainly being done to enable quantized kv cache through the ET stack.
Differential Revision: D62301840
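Since the description calls out quantize_per_token's output_dtype, here is a minimal sketch of that part of the flow for the kv-cache use case. The positional signature `(input, scales, zero_points, quant_min, quant_max, dtype)` is an assumption based on the decomposed quantization ops in `torch.ops.quantized_decomposed`; this is an illustration, not the PR's implementation.

```python
# Sketch only: per-token quantization of a key/value slice, checking that the
# requested output dtype is actually honored. Signatures are assumptions.
import torch
import torch.ao.quantization.fx._decomposed  # noqa: F401  # assumed registration point for quantized_decomposed ops

k = torch.randn(4, 16)  # e.g. a slice of keys to be cached

scales, zero_points = torch.ops.quantized_decomposed.choose_qparams_per_token(k, torch.int8)

qk = torch.ops.quantized_decomposed.quantize_per_token(
    k, scales, zero_points, -128, 127, torch.int8  # quant_min, quant_max, output dtype
)

# The second bullet above says the output dtype must be respected:
assert qk.dtype == torch.int8, qk.dtype
print(qk.shape, qk.dtype)
```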