M7 Electronic System Level Design
Fixed Point Arithmetic
Carsten Gremzow
Fixed Point versus Floating Point
Floating Point Arithmetic
§ After each arithmetic operation numbers are normalised
§ Used where precision and dynamic range are important
§ Most algorithms are developed in FP
§ Ease of coding
§ Higher cost (area, speed, power)
Fixed Point Arithmetic
§ Position of the binary point is fixed
§ Simpler HW, low power, less silicon
§ Converting a floating point simulation to a fixed point simulation is time consuming
§ Multiplication doubles the number of bits: an NxN multiplier produces a
2N-bit result
§ The code is less readable; overflow and scaling issues must be handled
explicitly
Floating Point
Add - typically 4 clocks:
compare, shift, add, normalize
Multiply - typically 8 clocks:
add, fixed point multiply, normalize, add
Divide - typically 20-40 clocks
Typical System Level Design Flow
Fixed Point versus Floating Point
Algorithms are developed in floating point format using tools like
Matlab
Floating point processors and HW are expensive
Fixed-point processors and HW are often used in embedded
systems
After the algorithms are designed and tested, they are converted
into a fixed-point implementation
The algorithms are then ported to a fixed-point processor or to
application-specific hardware
Qn.m Fixed Point Format
Qn.m format is a fixed positional number system for representing
fixed-point numbers
A Qn.m format N-bit binary number (N = n + m) assumes n bits to the left and m
bits to the right of the binary point
Qn.m Key Idea
in Qn.m format, n depends entirely on the range of the integer part
m defines the precision of the fractional part
Qn.m Positive Numbers
the MSB is the sign bit
for a positive fixed-point number, the MSB is '0':
$b = 0\, b_{n-2} \ldots b_1 b_0 \,.\, b_{-1} b_{-2} \ldots b_{-m}$
the equivalent floating point value of the positive number is
$b = b_{n-2} 2^{n-2} + b_{n-3} 2^{n-3} + \cdots + b_1 2^{1} + b_0 + b_{-1} 2^{-1} + \cdots + b_{-m} 2^{-m}$
for negative numbers, the MSB has negative weight and the equivalent
value is
$b = -b_{n-1} 2^{n-1} + b_{n-2} 2^{n-2} + \cdots + b_1 2^{1} + b_0 + b_{-1} 2^{-1} + \cdots + b_{-m} 2^{-m}$
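The weighted sum can be checked with a few lines of C. This is a minimal sketch, not part of the original slides: the helper name qnm_value is hypothetical, and it assumes the Qn.m word sits in the low n + m bits of an unsigned 32-bit integer with n >= 1 and n + m <= 32.

```c
#include <stdint.h>

/* Hypothetical helper: value of a two's-complement Qn.m word stored in the
   low (n + m) bits of raw. Assumes n >= 1 and n + m <= 32. */
double qnm_value(uint32_t raw, int n, int m)
{
    int total = n + m;
    double value = 0.0;
    double weight = 1.0 / (double)(1u << m);   /* weight of b_{-m} is 2^-m */

    for (int i = 0; i < total; i++) {
        int bit = (int)((raw >> i) & 1u);
        if (i == total - 1)
            value -= bit * weight;             /* MSB carries negative weight */
        else
            value += bit * weight;
        weight *= 2.0;                         /* next bit is worth twice as much */
    }
    return value;
}
```

For a ten-bit Q2.8 word, qnm_value(0x1C0, 2, 8) evaluates 01.11000000 and qnm_value(0x3C0, 2, 8) evaluates 11.11000000, which differ only in the negatively weighted MSB.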
Conversion to Qn.m
1. define the total number of bits used to represent a Qn.m number
§ assume ten bits in the example
2. fix the location of the binary point based on the value of the number
§ assume two bits for the integer part
§ the binary point is implied
Example
two bits hold the integer part and the remaining eight bits hold the fractional part
a ten bit Q2.8 signed number covers -2 to +1.9961 (2 - 2^-8)
increasing the fractional bits increases the precision
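The conversion of a real value into the ten-bit Q2.8 example format might look as follows in C. This is a sketch under assumptions not stated on the slides: the function name double_to_q2_8 is hypothetical, the value is rounded to nearest, and out-of-range inputs saturate instead of wrapping.

```c
#include <stdint.h>

/* Hypothetical helper: convert x to ten-bit Q2.8, returned in an int16_t.
   Representable raw values are -512 (-2.0) .. 511 (2 - 2^-8). */
int16_t double_to_q2_8(double x)
{
    double scaled = x * 256.0;              /* 2^8: move the binary point */

    if (scaled >= 511.0)                    /* saturate at the positive end */
        return 511;
    if (scaled <= -512.0)                   /* saturate at the negative end */
        return -512;

    /* round to nearest, ties away from zero */
    return (int16_t)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}
```

With eight fractional bits the step size is 2^-8, so 0.1 is stored as 26/256 = 0.1015625; more fractional bits would shrink that error.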
Qn.m Range Determination
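For a signed two's-complement Qn.m number, the standard range is -2^(n-1) to 2^(n-1) - 2^-m with a resolution of 2^-m. The C sketch below evaluates these bounds; the helper names are hypothetical and were not part of the original slide.

```c
/* Hypothetical helper: 2^e for small positive or negative integer e. */
static double two_pow(int e)
{
    double r = 1.0;
    for (; e > 0; e--) r *= 2.0;
    for (; e < 0; e++) r /= 2.0;
    return r;
}

/* Most negative value of a signed Qn.m number: -2^(n-1). */
double qnm_min(int n)
{
    return -two_pow(n - 1);
}

/* Most positive value: 2^(n-1) - 2^-m. */
double qnm_max(int n, int m)
{
    return two_pow(n - 1) - two_pow(-m);
}

/* Smallest step between neighbouring values: 2^-m. */
double qnm_resolution(int m)
{
    return two_pow(-m);
}
```

To pick n for a given signal, choose the smallest n for which every expected value lies between qnm_min(n) and qnm_max(n, m).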
The Software Side
using 16-, 32- and 64-bit integer types for fixed point arithmetic
§ → short, int and long int in C (sizes are platform dependent; int16_t, int32_t and int64_t from <stdint.h> are exact)
Converting a floating point number to fixed point:
§ Multiply the float by a power of 2 represented by a floating point
value, and cast the result to an integer:
int fp_pi = (int)(3.141593f * 65536.0f); // 16 fractional bits
§ After calculations, convert the result to an integer by discarding the fractional
bits, e.g.:
int result = fp_pi >> 16; // divide by 65536
§ Or, get the original float back by casting to float and dividing by
2^fractionalbits:
float result = (float)fp_pi / 65536.0f;
§ Note that this last option has significant overhead, which should be
outweighed by the gains.
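The three conversions above can be collected into small helpers. This is a minimal sketch assuming a 16:16 format in an int32_t; the names fixed_t, FRAC_BITS, fp_from_float, fp_to_int and fp_to_float are illustrative and not taken from the slides.

```c
#include <stdint.h>

#define FRAC_BITS 16                    /* assumed 16:16 format */
#define FP_ONE    (1 << FRAC_BITS)      /* 1.0 in fixed point = 65536 */

typedef int32_t fixed_t;

/* float -> fixed: scale by 2^FRAC_BITS and truncate */
fixed_t fp_from_float(float x)
{
    return (fixed_t)(x * (float)FP_ONE);
}

/* fixed -> int: drop the fractional bits. Right-shifting a negative value is
   implementation-defined in C; mainstream compilers shift arithmetically,
   which rounds toward minus infinity. */
int32_t fp_to_int(fixed_t x)
{
    return x >> FRAC_BITS;
}

/* fixed -> float: the costly direction, best kept out of inner loops */
float fp_to_float(fixed_t x)
{
    return (float)x / (float)FP_ONE;
}
```

A typical use keeps all arithmetic in fixed_t and calls fp_to_float only once, when the final result is needed as a float.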
Addition and Subtraction
Adding two fixed point numbers is straightforward:
fp_a = ... ;
fp_b = ... ;
fp_sum = fp_a + fp_b;
Subtraction is done in the same way.
Note that this requires fp_a and fp_b to have the same
number of fractional bits. Also, don't mix signed and unsigned
carelessly. If the formats differ, align the binary points first:
fp_a = ... ; // 8:24
fp_b = ... ; // 16:16
fp_sum = (fp_a >> 8) + fp_b; // result is 16:16
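The mixed-format addition can be written as a complete function. This is a sketch under stated assumptions: both operands are signed 32-bit values, the helper names fp_reduce and fp_add_8_24_to_16_16 are hypothetical, and the caller ensures that the sum fits into 16:16.

```c
#include <stdint.h>

/* Hypothetical helper: reduce x from from_frac to to_frac fractional bits
   (to_frac <= from_frac). The dropped low bits are lost precision.
   Right-shifting a negative value is implementation-defined in C;
   mainstream compilers perform an arithmetic shift. */
static int32_t fp_reduce(int32_t x, int from_frac, int to_frac)
{
    return x >> (from_frac - to_frac);
}

/* Add an 8:24 value to a 16:16 value; the result is 16:16. */
int32_t fp_add_8_24_to_16_16(int32_t fp_a, int32_t fp_b)
{
    int32_t a_16_16 = fp_reduce(fp_a, 24, 16);  /* align the binary points */
    return a_16_16 + fp_b;                      /* plain integer addition */
}
```

For example, 1.5 in 8:24 is 25165824 and 2.25 in 16:16 is 147456; their sum 3.75 in 16:16 is 245760.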
Multiplication
Multiplying fixed point numbers:
fp_a = ... ; // 10:22
fp_b = ... ; // 10:22
fp_sum = fp_a * fp_b; // 20:44
Situation 1: fp_sum is a 64 bit value.
§ Divide fp_sum by 2^22 to reduce it to 20:22 fixed point (shift right by
22 bits)
Situation 2: fp_sum is a 32 bit value.
§ Ensure that intermediate results never exceed 32 bits.
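Both situations can be sketched in C for the 10:22 format. The function names are hypothetical. The 32-bit variant shows only one possible approach, pre-shifting each operand by 11 bits so the product again has 22 fractional bits; it trades precision for width and assumes the true result fits into 10:22.

```c
#include <stdint.h>

#define FRAC_BITS 22    /* 10:22 format as in the example above */

/* Situation 1: 64-bit intermediate. The full 20:44 product is formed first,
   then shifted back to 22 fractional bits. The caller must ensure that the
   integer part of the result fits into 10 bits. */
int32_t fp_mul64(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;     /* 20:44 */
    return (int32_t)(prod >> FRAC_BITS);        /* 22 fractional bits */
}

/* Situation 2 (one possible approach, an assumption of this sketch):
   reduce each operand to 11 fractional bits before multiplying, so the
   intermediate stays within 32 bits. The low 11 bits of each operand
   are discarded. */
int32_t fp_mul32(int32_t a, int32_t b)
{
    return (a >> 11) * (b >> 11);               /* 11 + 11 = 22 fractional bits */
}
```

With a = 1.5 (6291456) and b = 2.5 (10485760), both variants return 15728640, which is 3.75 in 10:22; they differ once the operands use their low 11 bits.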
Division
Dividing fixed point numbers:
fp_a = ... ; // 10:22
fp_b = ... ; // 10:22
fp_sum = fp_a / fp_b; // 10:0
Situation 1: we can use a 64-bit intermediate value.
§ Multiply fp_a by 2^22 before the division (shift left by 22 bits); the quotient is then 10:22
Situation 2: we need to respect the 32-bit limit.
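Situation 1 can be sketched in C as follows; the function name fp_div64 is hypothetical, and the caller is assumed to guarantee a non-zero divisor and a quotient that fits into 10:22. The 32-bit case is not covered by this sketch.

```c
#include <stdint.h>

#define FRAC_BITS 22    /* 10:22 format as in the example above */

/* Divide two 10:22 values using a 64-bit intermediate.
   Pre-scaling the numerator by 2^22 gives it 44 fractional bits, so the
   quotient keeps 44 - 22 = 22 fractional bits. The scaling is written as a
   multiplication because left-shifting a negative value is undefined in C. */
int32_t fp_div64(int32_t a, int32_t b)
{
    int64_t num = (int64_t)a * ((int64_t)1 << FRAC_BITS);   /* 10:44 */
    return (int32_t)(num / b);                              /* 10:22 */
}
```

Without the pre-scaling, 3.75 / 1.5 computed as 15728640 / 6291456 yields the plain integer 2; with it, the result is 10485760, which is 2.5 in 10:22.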
The Hardware Side
as in software, addition and subtraction remain the same
§ it's the software's task to perform operand conversion / scaling
§ in VHDL:
s_sum <= s_a + s_b; -- beware of the carry bit
multiplication is harder yet simpler at the same time
§ you will need the full resulting width of the multiplication operation
§ perform shifting and truncation of leading bits of the result in a
separate assignment
§ cannot be performed in a single statement
§ concurrent example in VHDL
-- signed multiplication as written here requires ieee.numeric_std
signal s_a16, s_b16, s_prod16 : signed(15 downto 0); -- 16 bit signed, Q8.8 (implied by the slice below)
signal s_prod32 : signed(31 downto 0);
s_prod32 <= s_a16 * s_b16; -- generate 32 bit result (Q16.16)
-- binary point now between bit 15 and 16
s_prod16 <= s_prod32(23 downto 8); -- skip eight leading and trailing bits
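A bit-accurate C reference model is a convenient way to produce expected values for a testbench of this multiplier. The sketch below assumes Q8.8 operands, which is what the slice 23 downto 8 implies; the function name q8_8_mul_model is hypothetical.

```c
#include <stdint.h>

/* Hypothetical reference model of the VHDL fragment:
   Q8.8 x Q8.8 -> Q16.16, then keep bits 23 downto 8 as a Q8.8 result. */
int16_t q8_8_mul_model(int16_t a, int16_t b)
{
    int32_t prod32 = (int32_t)a * (int32_t)b;           /* full 32-bit product */
    uint32_t bits = (uint32_t)prod32 >> 8;              /* drop 8 trailing bits */
    /* Keeping the low 16 bits drops the 8 leading bits. Converting values
       above 32767 back to int16_t is implementation-defined; mainstream
       compilers wrap to the two's-complement pattern. */
    return (int16_t)(uint16_t)bits;
}
```

With a = 1.5 (384) and b = 2.0 (512) the model returns 768, which is 3.0 in Q8.8. Like the hardware, it wraps silently when the product does not fit into eight integer bits.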
The Hardware Side - Pitfalls
truncating leading bits of the multiplication result might destroy the sign
information
fixed point multiplication is faster than floating point, but . . .
. . . check the propagation delay of the multiplier network
a 32x32 bit multiplication is bound to break timing constraints with the AXI
bus clock
resort to pipelined multiplication in that case
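The sign pitfall can be made visible in a C model: if the truncated leading bits are not all copies of the sign bit, the 16-bit result is wrong. One common remedy, shown here as an assumption rather than as the method of the slides, is to saturate instead of truncate. The function name q8_8_mul_sat and the Q8.8 format are hypothetical choices for this sketch.

```c
#include <stdint.h>

/* Hypothetical saturating Q8.8 multiply: clamps the result instead of
   letting truncated leading bits flip the sign. */
int16_t q8_8_mul_sat(int16_t a, int16_t b)
{
    int32_t prod32 = (int32_t)a * (int32_t)b;   /* Q16.16 */
    /* Drop the 8 trailing fractional bits. Right-shifting a negative value
       is implementation-defined in C; mainstream compilers shift
       arithmetically, matching a signed slice in hardware. */
    int32_t q = prod32 >> 8;                    /* Q16.8 held in 32 bits */

    if (q > INT16_MAX)                          /* leading bits carried data */
        return INT16_MAX;
    if (q < INT16_MIN)
        return INT16_MIN;
    return (int16_t)q;
}
```

For 100.0 x 2.0 (25600 x 512) plain truncation of the leading bits would yield a negative number, while this version returns the largest representable Q8.8 value, 32767.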