8-bit quantization in dnn module and int8 layers #20228

Merged
opencv-pushbot merged 1 commit into opencv:master from jebastin-nadar:int8 on Aug 19, 2021
Conversation

@jebastin-nadar
Contributor

@jebastin-nadar jebastin-nadar commented Jun 7, 2021

PR for the GSoC'21 project on quantization in the DNN module. This PR adds functions to quantize FP32 models, int8 versions of several layers, and tests for the new layers.

| Layer | Status | Remarks |
| --- | --- | --- |
| Convolution | ✔️ | Variable weights unsupported |
| Inner Product | ✔️ | Variable weights unsupported |
| Pooling | ✔️ | Only Max and Average pooling |
| Padding | ✔️ | |
| Flatten | ✔️ | |
| Activations | ✔️ | |
| Concat | ✔️ | |
| Eltwise | ✔️ | Eltwise division unsupported |
| BatchNorm, Scale, Shift | ✔️ | |
| Data Permutation layers | ✔️ | |

A second PR is planned later this summer to load 8-bit quantized models from other frameworks (ONNX/TensorFlow) and perform inference using int8 layers and weights directly, without converting them to FP32 (as is done currently).
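For context, quantizing an FP32 model to int8 in schemes like this means mapping each tensor's floating-point range onto [-128, 127] via a scale and zero point. Below is a minimal, self-contained sketch of that affine mapping; it is illustrative only, and the function names are hypothetical, not this PR's actual code:

```python
# Illustrative sketch of affine (scale/zero-point) int8 quantization.
# Not the PR's implementation; function names are hypothetical.

def choose_qparams(xmin, xmax, qmin=-128, qmax=127):
    """Pick a scale and zero point mapping [xmin, xmax] onto [qmin, qmax]."""
    xmin, xmax = min(xmin, 0.0), max(xmax, 0.0)  # range must include 0.0
    scale = (xmax - xmin) / (qmax - qmin)
    zero_point = int(round(qmin - xmin / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """FP32 -> int8: scale, shift by the zero point, then clamp."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))

def dequantize(q, scale, zero_point):
    """int8 -> FP32 approximation of the original value."""
    return (q - zero_point) * scale
```

The round trip is lossy, but the reconstruction error stays within one quantization step (the scale), which is what makes int8 inference accurate enough in practice.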

mentor : @vpisarev
relates : #16633 #20188

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake

@jebastin-nadar
Contributor Author

@vpisarev MaxPool int8 tests pass now. Tests with max pooling as the last layer are commented out, as they compute max indices, which are not supported in the int8 version.

Although I still don't understand why max pooling as the last layer should compute the max index as well by default.

int numOutputs = requiredOutputs ? requiredOutputs : (type == MAX ? 2 : 1);
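For reference, the second output exists because layers such as MaxUnpool consume the argmax indices recorded during pooling. A toy 1-D sketch of a max pool that returns both outputs (illustrative only, not OpenCV's implementation):

```python
# Toy 1-D max pooling that produces both of the outputs discussed above:
# output 0 feeds the next layer, output 1 records where each max came from
# (needed by a later MaxUnpool layer). Not OpenCV's actual code.

def max_pool_1d(x, kernel, stride):
    values, indices = [], []
    for start in range(0, len(x) - kernel + 1, stride):
        window = x[start:start + kernel]
        k = max(range(kernel), key=lambda i: window[i])
        values.append(window[k])   # output 0: pooled max values
        indices.append(start + k)  # output 1: flat index of each max
    return values, indices
```

With the default of two outputs for MAX pooling, the index output is produced even when nothing downstream consumes it, which is the behavior being questioned here.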

@vpisarev
Contributor

vpisarev commented Jun 10, 2021

> @vpisarev MaxPool int8 tests pass now. Tests with maxpooling as the last layer are commented as they compute max indices which is not supported in int8 version.
>
> Although I still don't get why maxpooling as the last layer should compute max index as well by default.
>
> int numOutputs = requiredOutputs ? requiredOutputs : (type == MAX ? 2 : 1);

I actually suggest supporting the computation of indices (in FP32, as before) together with computing the max value in INT8. Yes, it will be slower, but it will let us provide 100% compatibility.

@jebastin-nadar
Contributor Author

> support computing indices (in FP32, as before) together with computing the max value in INT8

This is not possible right now, as a single variable determines the datatype of all the outputs of a layer. So outputs[0] and outputs[1] can both be either CV_32F or CV_8S; having one as CV_32F and the other as CV_8S is currently not possible.

int dtype; // Datatype of output blobs.

dst.create(shape, dtype);

Of course, changing "dtype" to a std::vector would solve it, but that would introduce a lot of complexity in allocating blobs and a lot of work for a feature which is rarely used. Maybe we can keep it as low priority and look at it later.
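To illustrate the point, here is a hypothetical sketch (with NumPy arrays standing in for cv::Mat blobs) of what replacing the single dtype with a per-output list would allow, e.g. int8 pooled values alongside float32 indices from the same layer:

```python
import numpy as np

# Hypothetical sketch: with a per-output list of dtypes, one layer could
# allocate outputs of different types. NumPy stands in for cv::Mat here;
# this is not OpenCV's blob-allocation code.

def allocate_outputs(shapes, dtypes):
    """Allocate one blob per output; each output may have its own dtype."""
    return [np.empty(shape, dtype=dt) for shape, dt in zip(shapes, dtypes)]

# e.g. int8 max values plus float32 max indices from one pooling layer
outputs = allocate_outputs([(1, 8), (1, 8)], [np.int8, np.float32])
```

With a single scalar dtype, both calls to `np.empty` would be forced to use the same type, which is exactly the limitation described above.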

@jebastin-nadar
Contributor Author

Some build issues:

  1. The AVX-512 path in int8layers/layers_common.simd.hpp causes build warnings and convolution test failures (segmentation fault). The same tests pass locally and on some other builders, so I suspect there is an issue with that specific path. Also, I cannot reproduce this locally, as my CPU only supports up to AVX2.
  2. DNN tests fail on the OpenCL builders. From what I remember, I haven't modified any OpenCL-related code, so I don't know what's causing the failures.

@vpisarev
Contributor

vpisarev commented Jun 12, 2021

> Ofcourse changing "dtype" to std::vector would solve it, but that will introduce a lot of complexity in allocating blobs and a lot of work for a feature which is rarely used. Maybe we can keep it as low priority and look at it later.

OK, sounds good to me. Let's keep it as a low-priority item.

@vpisarev
Contributor

> Some build issues :
>
> 1. AVX-512 path in int8layers/layers_common.simd.hpp causes build warnings and convolution tests failure (segmentation fault). The same tests are passed locally and in some other builders so I suspect there is an issue with that specific path. Also, I cannot reproduce this locally as my CPU only supports up to AVX2.

I suggest commenting out the AVX-512 branches for now (in your newly added code, not everywhere).

> 2. DNN Tests failure in OpenCL builders. From what I remember, I haven't modified any OpenCL related code, so don't know whats causing the failures.

Well, you need to figure that out. If you get stuck, we can look at it together.

@jebastin-nadar
Contributor Author

@alalek @vpisarev How do I ensure that only dnn module tests are run in the CI Linux OpenCL builder? I edited my original comment, but it looks like tests for all modules are being run.

@jebastin-nadar force-pushed the int8 branch 9 times, most recently from 28d7c78 to fc8350d on June 29, 2021 03:34
@jebastin-nadar force-pushed the int8 branch 2 times, most recently from 1afb5b9 to 7b5a392 on July 7, 2021 13:43
@jebastin-nadar
Contributor Author

@vpisarev As discussed, the int8 layers that had mostly duplicated code (concat, flatten, padding) have been removed, and the original FP32 layers have been modified to support 8-bit inputs as well.

In some parallel_for() bodies, I have used templates to support multiple datatypes; please check the latest commit to see if any changes need to be made.

@jebastin-nadar changed the title from "WIP : 8-bit quantization in dnn module and int8 layers" to "8-bit quantization in dnn module and int8 layers" on Jul 12, 2021
@jebastin-nadar marked this pull request as ready for review on July 12, 2021 07:10
@vpisarev
Contributor

@SamFC10, could you please fix the merge conflicts once again? And then squash commits? We will try to merge your pull request quickly.

@jebastin-nadar
Contributor Author

> And then squash commits

Looks like I messed up something.
Commands used:

git reset --soft HEAD~38
git commit -m ""
git push -f origin int8

@alalek
Member

alalek commented Aug 18, 2021

You can rollback changes:

git checkout -B int8 79eca09675fecd2b58c237233a0aaaa7197ace6f

@jebastin-nadar
Contributor Author

Managed to restore my commits and squashed them. Thanks for the help @alalek @vpisarev

@vpisarev vpisarev self-requested a review August 19, 2021 08:08
@vpisarev
Contributor

👍

@opencv-pushbot opencv-pushbot merged commit f787c49 into opencv:master Aug 19, 2021
asmorkalov pushed a commit that referenced this pull request Feb 16, 2024
dnn cleanup: On-fly-quantization removal #2498

On-the-fly quantization was first introduced via #20228.
We decided to remove it but keep the int8 layer implementations, because on-the-fly quantization
is less practical given that there are now many dedicated tools for model
quantization.

### Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [x] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [x] The feature is well documented and sample code can be built with the project CMake