Enable quant op to share quantization parameter between input and output #12408
Conversation
This pull request introduces 2 alerts and fixes 19 when merging 455b394 into 6bb807e - view on LGTM.com

This pull request introduces 2 alerts and fixes 19 when merging d36c456 into 62922f4 - view on LGTM.com

This pull request introduces 2 alerts and fixes 19 when merging 9e084ea into 315e006 - view on LGTM.com

This pull request introduces 2 alerts and fixes 19 when merging e74de32 into b39257a - view on LGTM.com
"""
Clean unused initializers including which is caused by quantizing the model.
return cleaned graph, and list of tensor names from this graph and all its subgraphes
that can not be found in this graph and its subgraphes
"""
Suggested change:
- """
- Clean unused initializers including which is caused by quantizing the model.
- return cleaned graph, and list of tensor names from this graph and all its subgraphes
- that can not be found in this graph and its subgraphes
- """
+ """Clean unused initializers including which is caused by quantizing the model.
+ Returns:
+     A cleaned graph, and a list of tensor names from this graph and all its subgraphs
+     that cannot be found in this graph and its subgraphs
+ """
nit: formatting, wording
return ONNXModel.clean_initializers_helper(self.graph(), self.model)

@staticmethod
def clean_initializers_helper(graph, model):
Prefer adding type annotations for new functions for clarity (for people and type checkers)
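A minimal sketch of what the annotated signature might look like. The `onnx.GraphProto` / `onnx.ModelProto` types are assumptions about the intended types here, and are written as string forward references so the sketch runs even without `onnx` installed:

```python
from typing import List, Tuple

# Hypothetical annotated signature for the helper discussed in review.
# The onnx types are string forward references, so this snippet does not
# need the onnx package at runtime.
def clean_initializers_helper(
    graph: "onnx.GraphProto", model: "onnx.ModelProto"
) -> Tuple["onnx.GraphProto", List[str]]:
    """Return the cleaned graph and the tensor names it could not resolve."""
    ...
```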
new_nodes = []
for node in graph.node:
    node_2_add = node
naming: Is there a name for node_2_add that conveys what it is better to readers?
    for attr in node.attribute
    if attr.type == onnx.AttributeProto.GRAPH or attr.type == onnx.AttributeProto.GRAPHS
]
if len(graph_attrs) > 0:
Suggested change:
- if len(graph_attrs) > 0:
+ if graph_attrs:
pythonic checks
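The suggestion relies on Python's truthiness rules: an empty sequence is falsy, so for a list the two conditions are equivalent. A minimal illustration:

```python
# Empty containers are falsy, so `if graph_attrs:` behaves the same as
# `if len(graph_attrs) > 0:` for lists.
graph_attrs = []
assert not graph_attrs           # empty list -> falsy

graph_attrs = ["subgraph_attr"]  # hypothetical element
assert bool(graph_attrs)         # non-empty list -> truthy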
if len(graph_attrs) > 0:
    kwargs = {}
    for attr in node.attribute:
        kv = {}
readability: Avoid abbreviations - they add cognitive load to readers. Here and elsewhere
Do not use abbreviations that are ambiguous or unfamiliar to readers outside the project, and do not abbreviate by deleting letters within a word.
https://google.github.io/styleguide/pyguide.html#316-naming
https://google.github.io/styleguide/cppguide.html#General_Naming_Rules
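A small sketch of what the suggested rename could look like; the spelled-out variable name is an assumption about the reviewer's intent, not a name from the PR:

```python
# Hypothetical rename: `gn` spelled out as `generated_name` so the loop
# variable documents what it iterates over.
sub_requesting_tensor_names = ["scale_0", "zero_point_0"]  # example data

# Before: {gn: 1 for gn in sub_requesting_tensor_names}
requested = {generated_name: 1 for generated_name in sub_requesting_tensor_names}
```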
    sub_requesting_tensor_names,
) = ONNXModel.clean_initializers_helper(attr.g, model)
kv = {attr.name: cleaned_sub_graph}
requesting_tensor_names.update({gn: 1 for gn in sub_requesting_tensor_names})
readability: Avoid abbreviations - they add cognitive load to readers.
Do not use abbreviations that are ambiguous or unfamiliar to readers outside the project, and do not abbreviate by deleting letters within a word.
https://google.github.io/styleguide/pyguide.html#316-naming
https://google.github.io/styleguide/cppguide.html#General_Naming_Rules
generated_names = {}
generated_names.update({output_name: 1 for node in graph.node for output_name in node.output if output_name})
for gn in generated_names:
readability: Avoid abbreviations - they add cognitive load to readers.
Do not use abbreviations that are ambiguous or unfamiliar to readers outside the project, and do not abbreviate by deleting letters within a word.
https://google.github.io/styleguide/pyguide.html#316-naming
https://google.github.io/styleguide/cppguide.html#General_Naming_Rules
class QDQGather(QDQOperatorBase):
    def __init__(self, onnx_quantizer, onnx_node):
Prefer adding type annotations to new code for clarity and correctness. Here and elsewhere
)
- self.tensors_to_quantize = []
- self.tensors_to_quantize_per_channel = []
+ self.tensors_to_quantize = dict()
R1735: Consider using {} instead of dict() (use-dict-literal)
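The two spellings build equal empty dicts; the literal skips a global-name lookup and a call, which is why pylint flags `dict()`:

```python
# `{}` and `dict()` produce equal empty dicts; the literal form avoids
# looking up and calling the `dict` builtin, hence pylint's R1735.
tensors_to_quantize = {}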
)
- def quantize_tensor(self, tensor_name):
+ def _is_tensor_quantizable(self, tensor_name):
Suggested change:
- def _is_tensor_quantizable(self, tensor_name):
+ def _is_tensor_quantizable(self, tensor_name: str) -> bool:
will have a separate PR to update the type annotations for all functions under quantization.
def _quantize_sharing_param_tensors(self):
    while self.tensors_to_quantize:
        has_update = False
nit: naming: updated, has_updated
)
self.quantized_value_map[tensor_name] = quantized_value
if not has_update:
    raise ValueError("There is acyclic dependence in quantization parameter shared mode")
Just making sure: do you mean a cyclic dependency (a cycle)?
It'd be helpful to also include an actionable suggestion in the error message
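The loop under review is a fixed-point worklist: keep resolving tensors whose shared source is already quantized, and if a full pass makes no progress the remaining dependencies must form a cycle. A simplified sketch of that pattern, using plain sets and dicts instead of the quantizer's internal state, with an error message that names the offending tensors (the actionable-suggestion point from review):

```python
def quantize_sharing_param_tensors(pending, quantized):
    """Hypothetical sketch: `pending` maps tensor -> tensor whose quantization
    parameters it must share; `quantized` is the set already processed."""
    while pending:
        has_updated = False
        for tensor, source in list(pending.items()):
            if source in quantized:       # source's parameters are known,
                quantized.add(tensor)     # so this tensor can share them
                del pending[tensor]
                has_updated = True
        if not has_updated:
            # A full pass with no progress means the remaining tensors
            # depend on each other cyclically; name them in the error.
            raise ValueError(
                "Cyclic dependency in shared quantization parameters: "
                + ", ".join(sorted(pending))
            )
    return quantized
```

For example, `quantize_sharing_param_tensors({"b": "a", "c": "b"}, {"a"})` resolves both tensors, while a mutual dependency such as `{"x": "y", "y": "x"}` raises the cycle error.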
This pull request fixes 20 alerts when merging 2b5df48 into 77cab7a - view on LGTM.com
"""Find out if a node exists in a graph or a node is in the
new set of nodes created during quantization.

Return:
Suggested change:
- Return:
+ Returns:
# Quantize the input
initializer = find_by_name(tensor_name, self.model.initializer())
- if initializer is not None:
+ if initializer:
nit: explicit is not None is generally good
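The explicit check matters whenever a found value can itself be falsy: `if initializer:` and `if initializer is not None:` then disagree. A minimal illustration with an empty container standing in for a "found but empty" result:

```python
# A lookup can succeed yet return a falsy object (e.g. an empty container
# or a proto with no fields set). The two checks then diverge.
initializer = []                 # "found", but empty

assert initializer is not None   # explicit check: it was found
assert not initializer           # truthiness check: would treat it as missing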
This pull request fixes 20 alerts when merging a897e79 into a3de1bb - view on LGTM.com
Description:
Quantization parameters for some ops' inputs and outputs need to be the same, for ops like Gather, Transpose, and Split. This PR adds support for sharing quantization parameters between input and output.
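Sharing is sound for these ops because they only rearrange elements without changing their values, so the input's scale and zero-point describe the output exactly. A pure-Python sketch with hypothetical values, using a reversal as a stand-in for a rearranging op like Transpose:

```python
def quantize(values, scale, zero_point):
    """Linear quantization: real value -> integer level."""
    return [round(v / scale) + zero_point for v in values]

def dequantize(qvalues, scale, zero_point):
    """Inverse mapping: integer level -> real value."""
    return [(q - zero_point) * scale for q in qvalues]

# Gather/Transpose/Split only rearrange elements, so the input's
# scale/zero-point remain valid for the output (values are hypothetical).
scale, zero_point = 0.5, 10
data = [1.0, -2.0, 3.5]

q_in = quantize(data, scale, zero_point)
q_out = list(reversed(q_in))     # stand-in for a rearranging op

# Dequantizing the rearranged output with the *shared* parameters
# recovers the same values, just reordered.
assert dequantize(q_out, scale, zero_point) == list(reversed(data))
```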