100% found this document useful (1 vote)
278 views33 pages

Hybrid Low Radix Encoding Multipliers

This document describes a project report on hybrid low radix encoding-based approximate Booth multipliers. It includes an abstract, introduction to approximate computing and Booth multiplier algorithms, related work on radix-4 and radix-8 encoding, and proposed work on the project to approximate partial product bits in radix-8 Booth multipliers to reduce hardware requirements. Results and conclusions are also outlined.

Uploaded by

AKASHDEEP MITRA
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
100% found this document useful (1 vote)
278 views33 pages

Hybrid Low Radix Encoding Multipliers

This document describes a project report on hybrid low radix encoding-based approximate Booth multipliers. It includes an abstract, introduction to approximate computing and Booth multiplier algorithms, related work on radix-4 and radix-8 encoding, and proposed work on the project to approximate partial product bits in radix-8 Booth multipliers to reduce hardware requirements. Results and conclusions are also outlined.

Uploaded by

AKASHDEEP MITRA
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

A Project Report

On
Hybrid Low Radix Encoding-Based Approximate Booth Multipliers
BY
G.V.N.Karthik
2020H1230314H
Nihit Kapoor
2020H1230330H
Akashdeep Mitra
2020H1230328H

1
CONTENTS
Abstract……………………………………………………………….... 3
1.Introduction……………………………………………….………….. 4-6
2.Related work ……………………………………………….………….. 7-10
3. Proposed work ……………………………………………….………….. 11-17
4. Results ……………………………………………….………….. 18-31.
Conclusion……………………………………………….………….. 32
References……………………………………………….………….. 33

2
ABSTRACT
In this project we are going to work on Approximate Radix-8 Booth Multipliers. Generally, Booth
Multipliers are used in order to reduce partial product rows and the time required for the
multiplication operation which uses simple shift operations and addition and subtraction based on
the previous LSB and the present LSB.
The modified booth algorthims take it further deep where we decrease the number of
multiplications required to generate the partial product bits and once the partial product bits are
obtained based on the encoding we will shift the obtained partial products and place them in the
correct positions in order to carry out the accumulation.
Coming to the Approximate multiplication it is highly useful for the cases where speed
optimization is required and accuracy is not a main concern. In this project we were going to
approximate the partial products bits which comes out to be +3*c or -3*c to be either (+2*c or -2*c)
or (+4*C or -4*c). This depends on the type of approximator that we are going to use. Once the
generation of the approximate partial product bits is done then we are going to accumulate these
bits using a Wallace tree kind of structure which will also help us in reducing the delay.

3
Introduction

Approximate Computing:
It is one of the fields which was gaining a huge popularity because of its ability to reduce the area
and improve the speed of the ALU. Speed and Area are the most important constraints for any
digital block and ALU being the basic building block of a processor if we are able to improve the
speed of the operation then we can improve the performance of the whole processor which helps us
to operate at higher frequencies. The basic idea behind approximate computing is that not all the
applications require highly accurate results in some of the applications like digital signal
processing, Image processing, GPU applications we need not require accurate arthimetic results.
Using these approximate adders and multipliers we are able to reduce the area of the ALU, power
consumed by the ALU and increase the speed of operation. One of the important factor determining
the efficiency of any digital block is PDP which is also being enhanced as a result of using these
appromixate computing blocks.

Booth Multiplier Algorthim:


• The algorithm of a Booth multiplier is such that we will use simple shifters and adder to be
the multiplication.

• It is generally used for the signed integer multiplication for this it uses 2’s complement
representation of the multiplier and the multiplicand.

• Typically it proceeds from the LSB to MSB. The multiplication of 2i is then typically
replaced by incremental shifts of P accumulator to the right between steps.

• We will place the no. of bits from max{multiplicand, multiplier} in the source counter.

• In BR register we will place Multiplicand , Multiplier in QR register and 0 in Qn+1.

• We will load the Accumulator with 0 initially. On checking the QR(0)Qn+1 bits we will do
the shifting of the {Ac reg , Qr register} initially we will place 0 in Qn+1 or addition of the
value in the BR register with the AC register and shift or subtract and right shift the {Ac reg
,Qr register}.

• Upon Qn+1 becoming 0 we will come out of the booth multiplier and the final result is
obtained.

FLOW CHART for the booth multiplier algorthim:

4
Modified Booth Algorthims:
Inorder to reduce the delay further modified booth algorthims were being proposed there were
many variants in the modified booth algorthims like radix-2, radix-4, radix-8, radix-16 etc..,
In this project were working on the radix-4 and radix-8. These modified booth algorthims make the
computation much more faster when compared with conventional multipliers.
Now let’s see about radix-4 Booth multiplication.

Radix-4:
• In radix implementation of the booth multiplier we will divide the given multiplicand into
three bits group each and then accordingly we will prepare the partial products according to
the algorithm that was proposed.

• For example if the encoded radix-4 is 001 we will get the partial product as the
1*multiplicand.

• In the three bits the MSB represents the sign of the partial product and we will have thwo
types of partial products possible apart form 0 1. +1*x, 2.+ 2*x both of these can be
obtained simply by a shift operation

• We will right shift each stage by two bits with respect to the previous stage and then finally
add them to get the result.

5
Now let us see about radix-8 Booth Encoding algorthim.
Radix-8 Algorthim:
• Encoding in the radix-8 will be similar to radix-4 encoding but we need to take an extra bit
into the encoding bits i.e., we need to take a total of 4 bits into the encoded bits.
• Due to the increased bits in the encoding we will have extra partial product bits in each row
in radix-4 we have only +1*x , +2*x, 0 but when it comes to radix-8 we will have two more
products they are 0,+1*x,+2*x,+3*x,+4*x
• In radix-4 we have only 0,1*x ,2*x partial products which can be simply be obtained by
shift operation when it comes to Radix-8 we have 1*x,2*x,3*x,4*x out of which 3*x cannot
be obtained simply by a shift operation.
• In order to calculate +3*x we need a shifter followed by a adder which will result in
+2*x+1*x. So we will do 1 multiplication operation and then simply add the original
number to the product which will result in the final partial product required.
• So the hardware required to calculate the partial product bits for radix-8 will be complex
than the one for the radix-4. But when it comes to the partial product rows the number of
rows will be lesser when compared with the radix-4.
• So we need to make a compromise between these two things when ever we are going to
implement a hardware using the modified booth algorthim.
• So there was a lot of research going on these algorthims in order to reduce the hardware so
that both the delay and area required will be minimized.
• This problem will be faced by the other modified booth algorthims like radix-16, radix-64
and so on.
• So in this project we were working on one of those structures which will help us to reduce
the hardware by approximating the partial products like +3*C to +2*c or +4*c.

6
Related Work
Encoding the bits in radix-4 are done using the following truth table which shows which bits will be
represting which number.
Truth table for radix-4 encoders.
Radix -4 Sign *1 *2 PPi
encoded PP

000 0 0 0 0

001 0 1 0 1*x

010 0 1 0 1*x

011 0 0 1 2*x

100 1 0 1 -2*x

101 1 1 0 -1*x

110 1 1 0 -1*x

111 0 0 0 0

In order to generate the gate level structure from this truth table we need to take the representations
from the table and then optimize using the K-Map upon doing that we can obtain the following
structure for radix-4 encoders.
Radix-4 Encoder gate level structure:

Figure-1 : Radix-4 encoder.

Boolean Expressions:
The following expressions were obtained using K-Map optimizations by taking their values
accordingly from the truth table that was shown above
If we consider the bits that were encoded to be d2i+1, d2i, d2i-1.
The encoding bit X1 = d2i-1 (xor) d2i .
The encoding bit X2 = d2i+1.d2i’.d2i-1’+ d2i+1’.d2i.d2i-1.
7
The sign bit = d2i+1.(d2i’+d2i-1’).

Radix-8
Truth table showing the encoding of radix-8.
Radix-8 encoded Sign 1 2 3 4 PPi

0000 0 0 0 0 0 0

0001 0 1 0 0 0 1*x

0010 0 1 0 0 0 1*x

0011 0 0 1 0 0 2*x

0100 0 0 1 0 0 2*X

0101 0 0 0 1 0 3*X

0110 0 0 0 1 0 3*X

0111 0 0 0 0 1 4*x

1000 1 0 0 0 1 -4*x

1001 1 0 0 1 0 -3*x

1010 1 0 0 1 0 -3*x

1011 1 0 1 0 0 -2*x

1100 1 0 1 0 0 -2*x

1101 1 1 0 0 0 -1*x

1110 1 1 0 0 0 -1*X

1111 1 0 0 0 0 0

Radix-8 Encoder gate level structure:


8
Figure-2 : Radix-8 encoder.

Boolean Expressions:
The following expressions were obtained using K-Map optimizations by taking their values
accordingly from the truth table that was shown above
If we consider the bits that were encoded to be d2i+1, d2i, d2i-1.
The encoding bit X1 = di (xor) di-1 .
The encoding bit X2 = di+1’.di.di-1+di+2.di+1’.(di-1+di)+ di+2’.di+1.(di-1’+di’)+di+1.di’.di-1’
The encoding bit X4 = di+2’.di+1.di.di-1+ di+2.di+1’.di’.di-1’
The sign bit = di+2.di+1’+di+2.di’+di+2.di+1.di-1’

From these expression and the structure we can say that the circuit that will be required to
implement the radix-8 encoding will be much more complex than the one that we have used to
design the radix-4. So now let us the process of the accumulation that has been done in the radix-4
algorthim that will be required to obtain the final result.

Accumulation:

We can use various kinds of adders and compressors to obtain the final result but one of the most
popular and most effiecient way is to use Wallace tree adder technique in which we will use Half
adder’s and Full adder’s to accumulate the obtained partial product bits by dividing the partial
product row’s into set of three and then adding them using half adders and full adders according to
the requirement i.e., if there are three bits in the column we can use a Full adder if there are only
two bits in the column we will go with a Half adder. If there is a single bit in the column we can
simply pass on the bit.

Now once we complete the first stage of addition we will do the second stage of addition by taking
the sum and carry bits that have been generated from the previous stage according to their positions
in the partial product rows.
we will continue with process till we will be left with two rows in which we use a simple ripple
carry adder in order to calculate the sum of the final two rows.

9
Structure showing Wallace tree accumulation:
The following figure shows the accumulation that has been done for a radix-4 booth multiplier
using Wallace tree accumulation.

Figure-3: Showing Wallace tree Accumulation.

10
Proposed Worked
Now let us move on to the proposed structure and the work that has been done in order to reduce
the area and the PDP of the radix-8 multiplier by compromising with respect to the accuracy and
see the algorthim that has been proposed in place of the conventional radix-8 booth multiplier.

In the proposed structure we will use approximated radix-8 encoder for the calculation of the partial
product bits of the LSB in the multiplier and then we use the exact radix-4 encoder for the MSB in
the multiplier.

In this project we were going to implement two structures which have been proposed to reduce the
hadware overhead and increase the speed of the operation.

Let us now first look at the structure of HLRBM1.

Structure of HLRBM1:

We can see in this structure that we are having a total of 3 radix-8 encodings and four radix-4
encodings. In this the radix-8 encoders that are being used were being approximate for which the
truth tables are attached in below.
In this multiplier we will use the architecture of R8ABE1 radix-8 encoder.

Radix-4 encoding truth table:

11
In this we have used the exact radix-4 encoders that is PPi so the encoding bits will be same as in
the exact radix-4.

Approximate Radix-8 encoding truth table:

X1 = (di (xor) di-1) . (di+2 (xnor) di+1)


X2 = di+1’.di.di-1 + di+2’.di+1.di-1’+di+1.di’.(di-1’+di+2’)+di+2.di+1’.(di-1+di)
X4 = di+2(di+1.di-1.di+di+1’.di’.di-1’)
S = di+2.(di+1’+di’+di. di-1’)
We use the approximate radix-8 encoding for the LSB bits in the multiplier so that the accuracy
willnot be decreased too much. So that we will be doing a good compromise between the accuracy
and the Power delay product.
In this structure of the approximate radix-8 multiplier we will be having 7 partial product rows.
Here we approximating the +3C to be +2C in this structure because the error distance is only -1.
We will use the structure of the radix-4 and radix-8 partial product generators to produce each
single partial product bit.
Later we use the obtain product bits to generate the entire partial product row and we add the sign
bit at the last bit of the generated partial product row of thr radix-8 partial product.
We will or the sign bits of the radix – 4 generated partial product with the partial product bits
present vertically above the sign bit to reduce the hardware.
For the first row we will extend the sign bar bits and then place sign bit at the 21st bit.
From the second row we will use a sign bit bar and will extend it using two 1’s at the start.
In the third row we will ue a sign bit bar only to extend the bits.
For the radix four partial products we will use a sign bit bar and 1 to extend the partial product row.
We will place the partial product rows with a space of three bits from the row present above
because the radix-8 will take a total of 4bits so we need to take the bits with appropriate care while
arranging the bits.
After generating these bits we need to accumulate the bits using a Wallace tree adder to increase the
speed of the accumulation.

12
Structure of HLRBM-2:

We can see in this structure that we are having a total of 4 radix-8 encodings and 2 radix-4
encodings. In this the radix-8 encoders that are being used were being approximate for which the
truth tables are attached in below.
In this multiplier we will use the architecture of R8ABE2 radix-8 encoder.

In this we have used the exact radix-4 encoders that is PPi so the encoding bits will be same as in
the exact radix-4.

Approximate Radix-8 encoder:

X1 = (di (xor) di-1) . (di+2 (xnor) di+1)


X2 = di+1’.di.di-1 + di+2’.di+1.di’+di+1.di’.di-1’+di+2.di+1’.di
13
X4 = di+2’.di+1.di+di+2.di+1’.di’
S = di+2.(di+1’+di’+di. di-1’)

In this structure of radix-8 encoder we will approximate the +3C to either +4C or +2C.
So we need to again code the radix-8 bits to fit the needs.
We use the approximate radix-8 encoding for the LSB bits in the multiplier so that the accuracy
willnot be decreased too much. So that we will be doing a good compromise between the accuracy
and the Power delay product.
In this structure of the approximate radix-8 multiplier we will be having 6 partial product rows.
Here we approximating the +3C to be +2C or + 4Cin this structure because the error distance is
either -1 or 1.
We will use the structure of the radix-4 and radix-8 partial product generators to produce each
single partial product bit.
Later we use the obtain product bits to generate the entire partial product row and we add the sign
bit at the last bit of the generated partial product row of thr radix-8 partial product.
We will or the sign bits of the radix – 4 generated partial product with the partial product bits
present vertically above the sign bit to reduce the hardware.
For the first row we will extend the sign bar bits and then place sign bit at the 21st bit.
From the second row we will use a sign bit bar and will extend it using two 1’s at the start.
In the third row we will ue a sign bit bar only to extend the bits.
For the radix four partial products we will use a sign bit bar and 1 to extend the partial product row.
We will place the partial product rows with a space of three bits from the row present above
because the radix-8 will take a total of 4bits so we need to take the bits with appropriate care while
arranging the bits.
After generating these bits we need to accumulate the bits using a Wallace tree adder to increase the
speed of the accumulation.
Calculations:

14
15
16
17
Results
Verilog Code:
HLR BM-1:

18
19
20
21
22
23
24
HLRBM-2 Code:

25
26
27
28
29
HLRBM-1.

30
HLR BM-2.

LUT’s Usage:

Figure showing LUT’s usage in HLRBM-1

Figure showing LUT’s Usage in HLRBM-2.

31
Conclusion
Using these approximate arthimetic blocks we will be able to reduce the delay involved in the
computational block along with power consumption optimization. They will also help us in
reducing the area. These three are the most important parameters involved in the design of any
design. Another parameter which determines the perfomance of the design is PDP which is also
optimized due to these approximate computational blocks.
But we need to contribute in terms of the accuracy of the output. So we can use these approximate
computationla blocks only in such applications where accuracy is not an issue and requires more
optimized latency with these points keepin in mind we need to use these approximate
computational blocks.
Numbers taken Actual product HLRBM-1 HLRBM-2
2*3 6 4 8
515*17 8755 8738 8782

32
References
1. 1. Hybrid Low Radix Encoding-Based Approximate Booth Multipliers Haroon Waris,
Chenghua Wang, and Weiqiang Liu , Senior Member, IEEE
2. Approximate Radix-8 Booth Multiplier for Low Power and High Speed Applications Bipul
Boroa , K. Manikantta Reddya,∗ , Y.B. Nithin Kumara , M.H. Vasantha
3. DESIGN AND IMPLEMENTATION OF FASTER AND LOW POWER MULTIPLIERS
(A Thesis)

33

You might also like