[pt][quant] Update the misleading comments for zero_points and scale in dynamic quant linear module #28767
Conversation
…in dynamic quant linear module

The scale and zero_point are for the output activation tensor, not for the weight tensor.

Differential Revision: [D18164949](https://our.internmc.facebook.com/intern/diff/D18164949/)
Pull Request resolved: #28767
z-a-f left a comment:
LGTM
Old:
scale: `scale` parameter of weight Quantized Tensor, type: double
zero_point: `zero_point` parameter for weight Quantized Tensor, type: long

New:
scale: `scale` parameter of output activation Quantized Tensor, type: double
zero_point: `zero_point` parameter for output activation Quantized Tensor, type: long
There is no reason to capitalize the "Quantized Tensor" -- let's keep it as "quantized tensor"
raghuramank100 left a comment:
I think this comment is misleading; please see my comments below.
If :attr:`bias` is ``True``, the values are initialized to zero.

Old:
scale: `scale` parameter of weight Quantized Tensor, type: double
zero_point: `zero_point` parameter for weight Quantized Tensor, type: long

New:
scale: `scale` parameter of output activation Quantized Tensor, type: double
For dynamic quantization there is no output scale and zero-point. We should not be exposing these in the comments. The activations are in floating point.
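A minimal sketch of this behavior, assuming the `torch.quantization.quantize_dynamic` API: with dynamic quantization only the weights are stored as quantized tensors, so the layer's output comes back as an ordinary float tensor.

```python
# Sketch under the assumption above: with dynamic quantization only the Linear
# weights are stored as int8 quantized tensors; activations stay in floating point.
import torch
import torch.nn as nn

float_model = nn.Sequential(nn.Linear(4, 4))
dq_model = torch.quantization.quantize_dynamic(
    float_model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(2, 4)
out = dq_model(x)
print(out.dtype)                    # torch.float32 -- the output is not a quantized tensor
print(dq_model[0].weight().dtype)   # torch.qint8  -- only the weight is quantized
```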
We have this PR mainly because, when we print the modules, we print the scale and zero_point values:
https://github.com/pytorch/pytorch/blob/master/torch/nn/quantized/modules/linear.py#L47-L48
One example is for the RoBERTa model after dynamic quantization:
(19): TransformerEncoderLayer(
  (dropout): Dropout(p=0.1, inplace=False)
  (attention): MultiheadAttention(
    (dropout): Dropout(p=0.1, inplace=False)
    (input_projection): DynamicQuantizedLinear(in_features=1024, out_features=3072, scale=1.0, zero_point=0)
    (output_projection): DynamicQuantizedLinear(in_features=1024, out_features=1024, scale=1.0, zero_point=0)
  )
  (residual_mlp): ResidualMLP(
    (mlp): Sequential(
      (0): DynamicQuantizedLinear(in_features=1024, out_features=4096, scale=1.0, zero_point=0)
      (1): GeLU()
      (2): Dropout(p=0.1, inplace=False)
      (3): DynamicQuantizedLinear(in_features=4096, out_features=1024, scale=1.0, zero_point=0)
      (4): Dropout(p=0.1, inplace=False)
    )
  )
  (attention_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
(20): TransformerEncoderLayer(
  (dropout): Dropout(p=0.1, inplace=False)
  (attention): MultiheadAttention(
    (dropout): Dropout(p=0.1, inplace=False)
    (input_projection): DynamicQuantizedLinear(in_features=1024, out_features=3072, scale=1.0, zero_point=0)
    (output_projection): DynamicQuantizedLinear(in_features=1024, out_features=1024, scale=1.0, zero_point=0)
  )
  (residual_mlp): ResidualMLP(
    (mlp): Sequential(
      (0): DynamicQuantizedLinear(in_features=1024, out_features=4096, scale=1.0, zero_point=0)
      (1): GeLU()
      (2): Dropout(p=0.1, inplace=False)
      (3): DynamicQuantizedLinear(in_features=4096, out_features=1024, scale=1.0, zero_point=0)
      (4): Dropout(p=0.1, inplace=False)
    )
  )
  (attention_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
  (final_layer_norm): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
)
In the docstring comment, we would like to tell users that the scale=1.0 and zero_point=0 shown for the DynamicQuantizedLinear modules above are for the output activation (to be consistent with static quantization).
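A hedged sketch of the distinction the docstring update is about, assuming the `torch.quantization.quantize_dynamic` API and the `weight()` accessor of the dynamic quantized Linear module: the `scale=1.0` and `zero_point=0` shown in the repr are placeholders for the output activation, while the weight tensor carries its own quantization parameters.

```python
# Sketch under the assumptions above: compare the module-level scale/zero_point
# shown in the repr with the weight tensor's own quantization parameters.
import torch
import torch.nn as nn

dq = torch.quantization.quantize_dynamic(
    nn.Sequential(nn.Linear(1024, 3072)), {nn.Linear}, dtype=torch.qint8
)
qlinear = dq[0]

# At the time of this PR the repr looked like the RoBERTa printout above, e.g.
# DynamicQuantizedLinear(in_features=1024, out_features=3072, scale=1.0, zero_point=0)
print(qlinear)

w = qlinear.weight()  # the quantized weight tensor
# If the weight is per-tensor quantized, it exposes its own scale/zero_point,
# which are unrelated to the placeholder values printed in the repr.
print(w.q_scale(), w.q_zero_point())
```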
… and scale in dynamic quant linear module

The scale and zero_point are for the output activation tensor, not for the weight tensor.

Differential Revision: [D18164949](https://our.internmc.facebook.com/intern/diff/D18164949/)
Per @raghuramank100's request, we removed the comments for `scale` and `zero_point`.
This pull request has been merged in b1ea19c.
Stack from ghstack:
The scale and zero_point are for the output activation tensor, not for the weight tensor.
Differential Revision: D18164949