Dr.
Jameel Ahmed
Professor and Head of Deparment of Electrical Engineering
HITEC University
Introduction
Digital systems, filters and algorithms are implemented a digital
hardware with finite word length.
Practical DSP implementation requires special attention because of
potential quantization and architecture error as well as the
processing of over flow.
A DSP processor data format determines its ability to handle signals of
different precisions, dynamic ranges and SQNR.
Data Representation & Architecture
In fixed point DSP processors, a number is represented with a series of
binary digits (1’s and 0’s).
There are many different binary number systems such as
Sign magnitude
1’s complement
2’s complement
Since 2’s complement representation gives more logical and scientific
answers, therefore it is mostly used representation in DSP systems
DSP Data Format
DSP Data
Format
Flixed Floating
Point Point
16 Bit 24 Bit 32 Bit
IEEE-754 Others
Fixed Point Format
The fixed-point format is used to represent number b/w 0 and 1.
For example:
0 1 0 1 0 0 1 1
Sign
Bit
So the resulting answer will be
(0.5+0.125+0.0156+0.0078)=>(0.6484)
Cont...
0.435
0 0 1 1 1 0 0 0
Sign
Bit
Result = ( 0.25+0.125+0.0625)=>(0.4375)
Truncation and Round off the Q7 numbers obtained in Q7 to Q5
format, also determine the truncated and Round off error.
How To Represent Signed Numbers
Plus and minus signs used for decimal numbers: 25 (or +25), -16, etc.
For computers, it is desirable to represent everything as bits.
Three types of signed binary number representations:
Signed magnitude
1’s complement
2’s complement
Signed magnitude
In each case left-most bit indicates sign:
Positive (0)
Negative (1)
Consider signed magnitude:
0 00011002= 1210 1 00011002= -1210
Sign bit Magnitude Sign bit Magnitude
One’s Complement Representation
The one’s complement of a binary number involves inverting all bits.
To find negative of 1’s complement number take the 1’s complement of
whole number including the sign bit.
0 00011002 1 11100112
Sign bit Magnitude Sign bit 1, Compliment
Two’s Complement Representation
The two’s complement of a binary number involves inverting all bits
and adding 1.
To find the negative of a signed number take the 2’s the 2’s
complement of the positive number including the sign bit.
0 00011002 1 11101002
Sign bit Magnitude Sign bit 2, Compliment
Overflow
The detection of an overflow after the addition of two binary numbers
depends on whether the considered numbers are signed or unsigned.
When two unsigned numbers are added, an overflow is detected from
the end carry out of the most significant position.
In the case of signed numbers, the leftmost bit always represents the
sign, and negative numbers are in 2’s complement form.
Cont…
When two signed numbers are added, the sign bit is treated as part of
the number and the end carry does not indicate an overflow.
Overflow example:
+70 0 1000110 -70 1 0111010
+80 0 1010000 -80 1 0110000
= +150 1 0010110 =-150 0 1101010
Cont…
An overflow cannot occur after an addition if one number is positive and
the other is negative, since adding a positive number to a negative number
produces a result that is smaller than the larger of the two original
numbers.
An overflow may occur if the two numbers added are both either positive
or negative
Integers vs. Fractional Numbers.
Different notations are used to represent different binary formats such
as Q5 and Q7 Formats .
Q5 format:
For example: (x=0.2)
0 0 0 1 1 0
Sign
Bit
0.2= (0.125+0.0625)=>(0.1875)
Cont...
Q7 Format:
For example: (x= 0.2)
0 1 0 1 0 0 1 1
Sign
Bit
(0.125+0.0625+0.0156)=>(0.203125).
Truncation & Round Off Error
Truncation Error= [Q7 format – Q5 Format]
Truncation Error=0.203125-0.1875=>(0.01562).
Round off error=[Rounded off Q7 num –Q5 format].
Round off error= 0.20-0.1875=>(0.0125).
Wordlenght
Word length
The data available for the given number of bits is called
word length.
Scale Down
If the fixed point number becomes too large for the
available wordlenght the programmer has to scale down the number.
Scale Up
If the fixed point number becomes too small for the available
wordlenght the programmer has to scale up the number.
In both cases the programmer has to keep the wordlenght and also
note that how many binary point has been shifted.
Qm.n Convention
The Qm.n convention uses m bits to represent the integer position of
the number and n bits represent the fractional portion.
Assuming N as the total number of bits ,we have N=m+n+1.
For an N bit signal number in Qm.n format ,the MSB (bN-1) in the sign
bit .
For example ,a 16 bit number that uses 1 sign bit and 15 bits for the
fractional values is called the Q0.1 format, or Q.15 or simply Q15.0
format.
Cont...
For example : (Q 15 - Q .15) Format
or
Cont…
Similarly for integers n=0 and m=N-1 the above equation is modified
as
Finally the equation is
Dynamic range
Dynamic range in dB:
It is defined as the ratio b/w the largest number
and the smallest number(excluding zero).
In fixed point integer’s representation, max is for unsigned numbers
and for signed numbers. While Min is 1 for both numbers.
Cont…
For the fractional Q 0.15 format
max for unsigned numbers.
max for signed numbers.
min for unsigned numbers.
min for signed numbers.
Therefore a 16 bit fixed point unsigned number always has a 96dB
dynamic range or 90dB for signed number regardless of the Q format
used for number representation.
Example
For a 16 bit fixed point unsigned number the dynamic range is given
by
max= =0.999984741
min= =0.00001525878906
As
So
Precision
Definition:
It is the smallest step (or difference) b/w two consecutive
numbers that can be obtained for a given number of bits.
Note that the Q0.15 format is commonly used in DSP systems and
data must be properly scaled so that their values lies b/w (-1 and
0.9999694824…...)
Table
Dynamic Range and Precision 16-bit Numbers Using Different Q Formats
Format Largest Positive Value Least Negative Value Precision
Q0.15 0.999969482421875 −1 0.00003051757813
Q1.14 1.99993896484375 −2 0.00006103515625
Q2.13 3.9998779296875 −4 0.00012207031250
Q3.12 7.999755859375 −8 0.00024414062500
Q4.11 15.99951171875 −16 0.00048828125000
Q5.10 31.9990234375 −32 0.00097656250000
Q6.90 63.998046875 −64 0.00195312500000
Q7.80 127.99609375 −128 0.00390625000000
Q8.70 255.9921875 −256 0.00781250000000
Q9.60 511.98437 −512 0.01562500000000
Q10.5 1023.96875 −1,024 0.03125000000000
Q11.4 2047.9375 −2,048 0.06250000000000
Q12.3 4095.875 −4,096 0.12500000000000
Q13.2 8191.75 −8,192 0.25000000000000
Q14.1 16383.5 −16,384 0.50000000000000
Q15.0 32,767 −32,768 1.00000000000000
Cont...
Problem:
Most fixed point DSP uses 2’s complement fractional
numbers in different Q format. However, assembler only recognizes
integer values.
Therefore ,following steps are used to convert into the Q format into
an integer that can be recognized by assembler.
Trade off b/w dynamic range and precision.
Steps
Normalize the fractional number to the range determined by the
desired Q format.
Multiply the normalized fractional number by where the ‘n’ is the
total number of fractional bits.
Round off the product to the nearest integer.
Example
Convert (1.18) into integer using Q0.15 format.
Solution:
Step1: Normalize the number to the range b/w .
Thus
Step2: Multiplying 0.59 by where n=15
Thus
=(0.59*32,768)(19,333.12)
Step3: Round off the value
(19,333.12)=19,333=>(4B85) hex
Conclusion
The same result can interpreted in fractional by dividing by
e.g
Therefore we can see that there is some error because of rounding off
step 3.