Figure 2.2: A schematic of matrix-multiply logic in a neural network accelerator for quantized inference.

Once the three quantization parameters are defined, we can proceed with the quantization operation. Starting from a real-valued vector x, we first map it to the unsigned integer grid $\{0, \dots, 2^b - 1\}$:
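The mapping to the integer grid can be sketched as follows. This is a minimal illustration, not the source's own formulation: it assumes the three quantization parameters are a scale factor, a zero-point, and a bit-width, and that the mapping is the usual round-then-clamp of asymmetric uniform quantization. The names `quantize`, `scale`, `zero_point`, and `bits` are hypothetical.

```python
def quantize(x, scale, zero_point, bits):
    """Map real values onto the unsigned integer grid {0, ..., 2**bits - 1}.

    Assumed scheme (not confirmed by the text): divide by the scale,
    round to the nearest integer, shift by the zero-point, then clamp.
    """
    q_max = 2 ** bits - 1
    quantized = []
    for value in x:
        # Python's round() resolves exact ties to the nearest even integer.
        q = round(value / scale) + zero_point
        # Clamp anything that falls outside the representable grid.
        quantized.append(min(max(q, 0), q_max))
    return quantized


# Hypothetical parameters: 4-bit grid {0, ..., 15}, scale 0.25, zero-point 4.
x_int = quantize([-1.0, 0.0, 0.5, 10.0], scale=0.25, zero_point=4, bits=4)
print(x_int)  # [0, 4, 6, 15]
```

In this example the real value 0.0 lands exactly on the zero-point, and 10.0 lies outside the representable range, so it saturates at the top of the grid.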
