Microprocessors and Microsystems 71 (2019) 102883

Contents lists available at ScienceDirect

Microprocessors and Microsystems


journal homepage: www.elsevier.com/locate/micpro

Low power and low area VLSI implementation of vedic design FIR
filter for ECG signal de-noising
M. Sumalatha a,∗, P.V. Naganjaneyulu b, K. Satya Prasad c
a ECE Department, Sai Spurthi Institute of Technology, B.Gangaram, Telangana 507303, India
b Sri Mittapalli College of Engineering, Tummalapalem, NH16, Guntur, Andhra Pradesh, India
c Rector of Vignan’s Foundation for Science, Technology & Research, Guntur, Andhra Pradesh, India

Article info

Article history:
Received 28 May 2019
Revised 7 August 2019
Accepted 29 August 2019
Available online 30 August 2019

Keywords:
Carry look ahead-adder
Electrocardiogram
Finite impulse response
Signal processing
Vedic design

Abstract

In recent years, the Finite Impulse Response (FIR) filter has played a major role in signal processing applications. Many earlier research papers described different types of FIR filter design, but none of them addressed a signal de-noising application with an effective multiplier design. In this paper, a Vedic Design - Carry Lookahead Adder FIR (VD-CLA-FIR) filter architecture is introduced to perform the FIR filter operation for an Electrocardiogram (ECG) signal de-noising application. Using a MATLAB program, the input ECG signal is read and Additive White Gaussian Noise (AWGN) is added to it. The de-noising process is implemented in Verilog and the obtained output is written to text files. To display the de-noised signal, the binary text values are read back in MATLAB. With the help of the Verilog code, FPGA performance (LUT, flip-flop, slices, and frequency) and ASIC performance (area, power, and delay) are evaluated. For the ASIC implementation, 180 nm and 45 nm technologies are used, and for the FPGA implementation, Virtex-4, Virtex-5, and Virtex-6 devices are used to evaluate the performance. The Mean Square Error (MSE), Bit Error Rate (BER), and Signal to Noise Ratio (SNR) are calculated from the de-noised signal. In 180 nm technology, VD-CLA-FIR reduces the area by 42.39%, the delay by 29.53%, the area-power product (APP) by 43.89%, and the area-delay product (ADP) by 70.41%. In 45 nm technology, VD-CLA-FIR reduces the area by 13.2%, the delay by 32.25%, the APP by 24.37%, and the ADP by 39.02% compared to the conventional methods.
© 2019 Elsevier B.V. All rights reserved.

1. Introduction

Nowadays, the FIR filter is a major building block in signal processing applications. Normally, the FIR filter is implemented in the transposed direct form [1]. The FIR filter is implemented in FPGA to evaluate the hardware utilization of the entire architecture [2]. The multiplier plays an important role in FIR filter design. If the FIR filter contains more multipliers, the total architecture requires more area to perform the filter operation. Hence, reducing the area of the multipliers is a significant task in FIR filter design [3]. The FIR filter provides several benefits, such as computational efficiency in multi-rate applications, an attainable linear phase response, and desirable numerical properties for finite precision and fractional arithmetic [4,5]. The digital multi-standard RFIR filter is implemented in wireless applications to decrease the Bit Error Rate (BER) [6,7]. The discrete FIR filter is used to design an efficient filter with low power consumption and high performance [8].

There are a number of existing architectures, such as FIR filter design using system generator [9], normal FIR filter [10], decimator FIR filter [11], linear-phase FIR filter [12], GA based FIR filter [13], parallel based FIR filter [14], low power multiplier FIR filter [15], and DA based FIR filter [16]. All of these architectures have higher hardware utilization and lower efficiency. Moreover, these existing architectures do not concentrate on applications. To overcome this problem, the VD-CLA-FIR filter method is introduced in this paper. The processing element requires an input and a coefficient. Those two inputs are multiplied with the help of the Vedic multiplier. In the accumulator module, a CLA is used to perform the addition. Due to the usage of VD and CLA, the FPGA and ASIC performances of the proposed method are improved over conventional methods. This architecture is applied to the ECG signal de-noising process. MATLAB is used to read the ECG signal and to show the de-noised signal. Finally, all the performance figures are evaluated and tabulated in the results section.


Corresponding author.
E-mail address: [email protected] (M. Sumalatha).

https://doi.org/10.1016/j.micpro.2019.102883
0141-9331/© 2019 Elsevier B.V. All rights reserved.
2 M. Sumalatha, P.V. Naganjaneyulu and K.S. Prasad / Microprocessors and Microsystems 71 (2019) 102883

This research work is organized as follows. Section 2 presents an extensive survey of recent papers on FIR filter design. Section 3 briefly describes the problem statement, and Section 4 explains the FIR architecture using the VD algorithm with the CLA approach. Section 5 presents the comparative experimental results of the proposed VD-CLA-FIR filter design and conventional methods. Section 6 concludes the work.

2. Related work

Researchers have suggested several methods for the FIR architecture. This section presents a brief evaluation of some significant contributions to the field of FIR architecture.

Author | Methodology | Advantage | Disadvantage
Chen et al. [17] | A new cost aware Sensitivity Driven Algorithm (SDA) is used for the design of FIR filter | This proposed method reduced 20.9% of the area and 72.7% of the power based on the different size of FIR filter design | High computational complexity
Doss, Soundararajanm, and Narasimha Murthy [18] | A pipelined architecture used for adaptive FIR filter design based on the DA algorithm | A high throughput rate achieved by the updated LUT. This method reduced the sampling period and area complexity | This method used a faster bit clock for carry save accumulation, but used a much slower clock for the remaining FIR filter operations
Park et al. [19] | A low power, high throughput, low area adaptive FIR filter design based on the DA algorithm | The throughput rate of the FIR filter increased by using parallel Lookup Table (LUT). Less power consumption achieved by using carry save accumulation (CSA) for faster bit clock | Low speed for all other filter operations
Martin Kumm, Konrad Moller, and Peter Zipf [20] | The proposed work compared two FIR architectures, DA and LUT multiplication technique, for Field Programmable Gate Array (FPGA) | The experimental results of the FIR filter architecture achieved a reconfiguration time of less than 100 nanoseconds (ns) | Increase in the number of slice count
Ramanathan, Anand, Reddy, and Sridevi [21] | A low power adaptive FIR filter designed based on the DA algorithm. The Least Mean Square (LMS) algorithm employed to reduce the Mean Square Error (MSE) between filter output and the desired response | The pipelined DA table reduced the switching activity and power consumption in the FIR filter design | This technique used a carry save accumulator for the FIR filter architecture, which occupied more area

3. Problem statement

This section describes the problem statement of FIR design and explains how the proposed methodology addresses the described problems. The concerns of FIR architectures are detailed below:

• A less efficient multiplier gives less efficient operation of the FIR filter.
• Different PE structures have been used, which require more power to synthesize.
• A normal adder occupies more hardware.
• After designing the FIR filter, no one focused on applying the FIR filter to ECG signals. This is one of the major problems.

Solutions: To overcome the above-mentioned drawbacks, the VD-CLA-FIR architecture is implemented in this paper. The proposed Vedic multiplier is designed using the Urdhva Tiryakbhyam Sutra (Vertically and Crosswise Algorithm). In this algorithm, the partial products and their sums are calculated in parallel and, because of this, the multiplier is independent of the clock frequency of the processor. This method is implemented in the FIR design methodology to enhance the filtering functions that eliminate unwanted noise. The Vedic multiplier based FIR filter design would be a good choice for high-speed DSP applications. This VD-CLA-FIR architecture is used to perform the ECG de-noising application. In the PE, the Vedic multiplier performs the multiplication operation, which helps to reduce the hardware utilization. The optimal CLA adder is also used in the FIR filter. Finally, the process speed and area are improved in this method.

4. Vedic-CLA-FIR filter methodology

Step 1: The overall block diagram is shown in Fig. 1. In MATLAB, the ECG signal is read from the arrhythmia database, which is taken from an internet source.
Step 2: White Gaussian Noise (WGN) is added to that ECG signal. Here, the noise density is 0.1 to 0.5. In MATLAB, the “awgn” function is used to add the noise to the input signal.
Step 3: With the help of the dec2bin MATLAB function, the ECG signal with WGN is converted into binary format.
Step 4: That binary signal is written to a text file (e.g. Noisy_ECG_signal.txt).
Step 5: This Noisy_ECG_signal.txt file is given as the input to Verilog. Normally, the input value is generated randomly, but here the noisy binary value is used as the input, which is stored in RAM. The coefficients are stored in ROM.
Step 6: In Verilog, the noise is reduced by using the Vedic Design-Carry Look-ahead Adder-FIR (VD-CLA-FIR) filter. The VD-CLA-FIR filter architecture designed in this work is shown in Fig. 2.
Step 7: In every clock cycle, the accumulator result is stored in a text file (e.g. Filter_output.txt). The Verilog code is used to measure the ASIC performance (area, power, delay) and FPGA performance (LUT, flip-flops, slices, and frequency).
Step 8: This Filter_output.txt file is given to MATLAB to obtain the de-noised ECG signal.
Step 9: From that de-noised ECG signal, MSE, SNR, and BER are calculated.

4.1. Vedic design CLA based FIR filter architecture

The block diagram of the VD-CLA-FIR filter architecture is shown in Fig. 2. It consists of an address generator, accumulator, clock generator, register, Read Only Memory (ROM), Random Access Memory (RAM), and control unit. The clock signal is generated by the clock generator. The coefficient data is stored in the ROM and the noisy input data is stored in the RAM. The control unit provides the clock signal to the filter for calculating the filter output, and the reset signal is used to reset the registers in the filter block.

The address generator generates the data addresses, which are used to read the data from ROM for the filter calculation. The input is read from RAM and the coefficient is read from ROM to perform the filter operation. The noisy data values (0, 255, etc.) are the inputs, which are multiplied with the coefficients (213, 18, etc.) to give the FIR filter result (y = 0). The coefficients are taken from the MATLAB FDA tool. The PE is designed with the help of the Vedic design.

Fig. 1. Overall block diagram.
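
The data hand-off in the flow of Fig. 1 (add noise, convert to binary, write one word per line, read back) can be sketched as follows. This is only a minimal illustration in Python rather than the authors' MATLAB script: the function names, the 8-bit word width, the min-max scaling, and the use of a Gaussian standard deviation `sigma` as the noise parameter are all assumptions made for the example.

```python
import random

def add_gaussian_noise(signal, sigma, seed=0):
    """Add zero-mean Gaussian noise (assumed noise model, std = sigma)."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def quantize_to_uint8(signal):
    """Scale samples to the 0..255 range (assumed 8-bit words)."""
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    return [round((s - lo) / span * 255) for s in signal]

def to_binary_lines(samples, width=8):
    """One fixed-width binary string per sample, as a text file would hold."""
    return [format(v, "0{}b".format(width)) for v in samples]

def from_binary_lines(lines):
    """Parse binary strings back into integers, skipping blank lines."""
    return [int(line, 2) for line in lines if line.strip()]

clean = [0.0, 0.2, 1.0, 0.3, 0.1]           # stand-in for ECG samples
noisy = add_gaussian_noise(clean, sigma=0.1)
lines = to_binary_lines(quantize_to_uint8(noisy))
recovered = from_binary_lines(lines)
```

Joining `lines` with newlines gives the kind of content a file such as Noisy_ECG_signal.txt would carry into the hardware simulation; `from_binary_lines` mirrors the read-back step on the MATLAB side.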

Fig. 2. VD-CLA-FIR filter operation.
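
The multiply-accumulate behaviour of the filter in Fig. 2 can be modelled in software as a reference for checking the hardware output. The sketch below assumes unsigned integer samples and coefficients and uses a hypothetical function name; the multiplication and addition are ordinary Python operations standing in for the Vedic multiplier and the CLA.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR: y(n) = sum over k of h(k) * x(n - k)."""
    outputs = []
    for n in range(len(samples)):
        acc = 0  # the accumulator register starts at zero
        for k, h in enumerate(coeffs):
            if n - k >= 0:          # samples before the start count as zero
                acc += h * samples[n - k]
        outputs.append(acc)
    return outputs

# Input values and coefficients quoted in the text: (0, 255, ...) and (213, 18, ...)
y = fir_filter([0, 255], [213, 18])
```

With these values the first output is 0, which agrees with the y = 0 result quoted for the first input sample.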

Fig. 3. Block diagram of the PE.


The Vedic design is used to reduce the hardware utilization. In the accumulator, the register initially contains zero. The register value is added to the PE output, and the sum is stored in the same register. Finally, the output is delivered from the register.

The common form of the N-tap FIR filter is defined in Eq. (1).

$$y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k) \tag{1}$$

Here, n = 0, 1, 2, ...; y(n) is the output of the FIR filter, h(k) is the coefficient of the FIR filter, and x(n − k) is the input sequence.

The Processing Element (PE), shown in Fig. 3, is one of the major blocks in the FIR filter design. The performance of the remaining blocks depends on the PE. The input and the coefficients are multiplied and processed by the Vedic design. The output of the VD and the coefficient are given to the complement circuit, whose output is given to the accumulator.

4.1.1. Vedic design algorithm

Fig. 4 shows the 2 × 2 multiplication using a Vedic multiplier. The Vedic multiplier architecture is high-speed compared to the existing multipliers, and it can be applied to all types of numeric schemes. To illustrate the Urdhva Tiryakbhyam multiplication, consider two binary numbers, the multiplicand (a1, a0) and the multiplier (b1, b0). The multiplication of these binary numbers gives a 4-bit output. Generally, the Vedic multiplier follows the steps below.

Step 1: Multiply the Least Significant Bit (LSB) of the multiplicand and the multiplier vertically, which gives the LSB of the final outcome.
Step 2: Multiply the LSB of the multiplicand with the Most Significant Bit (MSB) of the multiplier, and the MSB of the multiplicand with the LSB of the multiplier (crosswise), and add the products. The addition gives the second bit of the final outcome.
Step 3: Multiply the MSB of the multiplicand and the multiplier (vertically). The product is added to the carry obtained in the previous step. The resulting sum and carry are taken as the third and fourth bits of the final outcome. Fig. 4 shows the diagram of the 2 × 2 multiplication using the Vedic multiplier.

Fig. 4. The 2X2 multiplication by using a Vedic multiplier.

Vedic multiplier Algorithm
Input: “a” and “b” 2 input (4 bit)
Output: “Q” output (8 bit)
Stage: 1
1. Four 2 × 2 Vedic multiplier required
2. 1st Multiplier – a0, a1 x b0, b1
3. 2nd Multiplier – a2, a3 x b0, b1
4. 3rd Multiplier – a0, a1 x b3, b2
5. 4th Multiplier – a2, a3 x b3, b2
Stage: 2
6. Three adder is required
7. 1st adder => assign c1 = q0 + q1
8. 2nd adder => assign c2 = q2 + q3
9. 3rd adder => assign c3 = c1 + c2
10. End module

The block diagram of the 4X4 Vedic multiplier is shown in Fig. 5. The Verilog code is written according to this diagram to verify the results. The block contains four 2x2 multiplier blocks and three adder blocks. In this diagram, a0 to a3 and b0 to b3 represent the four-bit input values.

Initially, the least significant bits of the two inputs (a0, a1 and b0, b1) are given to a 2X2 multiplier block to perform the multiplication. In the 2nd stage a2, a3 and b0, b1, in the 3rd stage a0, a1 and b2, b3, and in the final stage a2, a3 and b2, b3 go through the 2X2 multiplier operation. The products of the last two stages are stored in one adder, and the products of the first two stages are stored in one more adder.

Fig. 5. 4 × 4 Vedic multiplier block diagram.
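
The structure of Fig. 5 can be checked with a small bit-level model. The sketch below follows the three steps of the 2 × 2 multiplier and then combines four 2 × 2 products with three additions. The function names are hypothetical, and the left shifts that align the partial products are an assumption added here, since the algorithm listing gives only the adder grouping.

```python
def vedic_2x2(a, b):
    """2-bit by 2-bit product using the vertical and crosswise steps."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    s0 = a0 & b0                    # Step 1: vertical product of the LSBs
    cross = (a0 & b1) + (a1 & b0)   # Step 2: crosswise products, added
    s1, carry = cross & 1, cross >> 1
    high = (a1 & b1) + carry        # Step 3: vertical product of MSBs plus carry
    s2, s3 = high & 1, high >> 1
    return s0 | (s1 << 1) | (s2 << 2) | (s3 << 3)

def vedic_4x4(a, b):
    """4-bit by 4-bit product from four 2x2 blocks and three adders."""
    a_lo, a_hi = a & 0b11, (a >> 2) & 0b11
    b_lo, b_hi = b & 0b11, (b >> 2) & 0b11
    q0 = vedic_2x2(a_lo, b_lo)      # 1st multiplier
    q1 = vedic_2x2(a_hi, b_lo)      # 2nd multiplier
    q2 = vedic_2x2(a_lo, b_hi)      # 3rd multiplier
    q3 = vedic_2x2(a_hi, b_hi)      # 4th multiplier
    c1 = q0 + (q1 << 2)             # 1st adder (shift is an assumed alignment)
    c2 = q2 + (q3 << 2)             # 2nd adder
    return c1 + (c2 << 2)           # 3rd adder gives the 8-bit result

product = vedic_4x4(13, 11)
```

Because a = a_lo + 4·a_hi and b = b_lo + 4·b_hi, the final sum equals q0 + 4·q1 + 4·q2 + 16·q3, which is the ordinary product a·b.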



Table 1
Experimental results of the ASIC performance of an existing and VD-CLA-FIR method.
Technology Method Length Area (µm²) Power (nW) Delay (ps) APP (µm² nW) ADP (µm² ps)
180 nm Existing- I [16] 8- tap 80,368 1,890,454 8274.1 151,932,007,072 664,964,832
16- tap 72,541 2,952,824 8326.3 214,200,805,784 603,976,366
32-tap 124,587 5,124,798 13,547 638,483,245,802.1 1,687,780,089
Existing -II [7] 8- tap 51,461 1,808,977 8073.4 93,091,765,397 415,444,653.4
16- tap 67,221 2,852,929 8137.3 191,776,780,641.6 546,997,443.3
32-tap 105,490 4,674,727 11,444 493,136,951,230 1,207,227,560
Existing- III [10] 8- tap 48,754 1,795,796 7845.2 89,989,938,184 382,475,130
16- tap 63,457 2,742,687 7984.2 174,042,688,959 506,640,688
32-tap 98,745 4,257,915 9842.6 420,447,836,424 971,848,290
LC-CSLA –FIR [8] 8- tap 43,446 1,706,552 7343.6 74,142,858,192 319,050,045.6
16- tap 57,396 2,617,388 7467.9 150,227,601,648 428,627,588.4
32-tap 84,663 3,736,800 7691.1 316,368,698,400 651,151,599.3
R8-CLA-FIR [22] 8- tap 38,637 1,015,734 3535 39,244,914,558 136,581,795
16- tap 56,437 1,126,414 3491 63,571,426,918 197,021,567
32-tap 68,670 1,407,048 1388 96,621,986,160 95,313,960
VD-CLA-FIR 8- tap 10,666 1,269,823 2846 13,543,932,118 30,355,436
16- tap 34,505 1,431,478 2109 49,393,148,390 72,771,045
32-tap 57,724 1,754,214 2112 101,260,248,936 121,913,088
45 nm Existing -I [16] 8- tap 5248 186,239 3147 977,382,272 16,515,456
16- tap 6984 298,453 3420.4 2,084,395,752 23,885,280
32-tap 12,473 461,847 5264.4 5,760,617,631 65,657,872
Existing-II [7] 8- tap 4761 166,514 2998.0 792,773,154 14,273,478
16- tap 6708 264,621 3050.4 1,775,077,668 20,459,400
32-tap 10,994 440,997 5097.4 4,826,273,510.016 55,781,568.6
Existing-III [10] 8- tap 4287.6 159,845 2887.2 685,255,515 12,367,908
16- tap 6212.2 256,987 2863.9 1,596,454,641.4 17,785,528.6
32-tap 9546.2 391,475 3847.2 3,737,098,645 36,724,231.4
LC-CSLA -FIR [8] 8- tap 3699.96 151,215 2726.4 234,590,538.81 10,087,570.944
16- tap 5358.47 247,354 2692.1 1,325,441,667.615 14,423,736.087
32-tap 8330.08 373,019 2836.6 3,107,248,270 23,623,880.928
R8-CLA FIR [22] 8- tap 3160.4 71,408.4 1536.7 225,679,107.36 4,856,586.68
16- tap 4616.17 55,475 1485 256,082,030.75 6,855,012.45
32-tap 5616.78 57,752 643.7 324,380,278.56 3,615,521.286
VD-CLA-FIR 8- tap 2302 74,143 602.4 170,677,186 1,385,804
16- tap 4191 76,142 1018 319,111,122 4,266,438
32-tap 5423.2 79,541 614 431,350,843 3,329,722
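
The APP and ADP columns of Table 1 are consistent with plain products of the area with the power and with the delay. A minimal check, with hypothetical function names, using the 180 nm 8-tap VD-CLA-FIR row:

```python
def area_power_product(area_um2, power_nw):
    """APP in um^2 * nW."""
    return area_um2 * power_nw

def area_delay_product(area_um2, delay_ps):
    """ADP in um^2 * ps."""
    return area_um2 * delay_ps

# 180 nm, 8-tap, VD-CLA-FIR: area 10,666 um^2, power 1,269,823 nW, delay 2846 ps
app = area_power_product(10666, 1269823)   # table value: 13,543,932,118
adp = area_delay_product(10666, 2846)      # table value: 30,355,436
```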

Table 2
Reduction percentage of the ASIC performances for the VD-CLA-FIR method.

Technology Window Reduction% of the area Reduction% of delay Reduction% of APP Reduction% of ADP

180 nm 8-tap 72.39 19.49 65.48 77.77


16-tap 38.86 39.58 22.30 63.06
32-tap 15.94 – – –
Average 42.39 29.53 43.89 70.41
45 nm 8-tap 27.15 60.80 24.37 71.46
16-tap 9.02 31.44 – 37.76
32-tap 3.43 4.51 – 7.90
Average 13.2 32.25 24.37 39.02
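
The entries of Table 2 appear to be relative reductions of VD-CLA-FIR against the R8-CLA-FIR [22] rows of Table 1; this reading is an inference from the numbers, not something the text states. The sketch below, with hypothetical function names, reproduces two of the 180 nm 8-tap entries under that assumption.

```python
def reduction_percent(reference, proposed):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return (reference - proposed) / reference * 100.0

def mean_of_available(values):
    """Average over the entries that are present (None marks a dash)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present)

# 180 nm, 8-tap: R8-CLA-FIR area 38,637 vs VD-CLA-FIR area 10,666 (um^2)
area_red = reduction_percent(38637, 10666)    # about 72.39
# 180 nm, 8-tap: R8-CLA-FIR delay 3535 vs VD-CLA-FIR delay 2846 (ps)
delay_red = reduction_percent(3535, 2846)     # about 19.49
# Average of the two available 180 nm delay reductions (32-tap is a dash)
delay_avg = mean_of_available([19.49, 39.58, None])
```

Under the same reading, a dash in Table 2 would correspond to a case where the VD-CLA-FIR value is not lower than the reference, and the averages are taken over the remaining entries.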

With the partial products held in two adders, the results of those two adders are given as the input to the final adder. Finally, the 8-bit result is delivered at the output of the Vedic multiplier design. This is the design of the 4-bit Vedic multiplier, which is instantiated 2 times to produce the 8-bit Vedic design operation used in the FIR filter architecture.

4.2. Carry look-ahead adder design

In this work, a 16-bit CLA is used in the VD-CLA-FIR filter design instead of the normal adder, as shown in Fig. 6. In the accumulator module, this CLA is used to improve the system performance. This adder achieves fast arithmetic operation in different types of data processing methods, and it is used for reducing area and power dissipation. The 16-bit CLA consists of four 4-bit CLA blocks and a carry generator, as shown in Fig. 7.

4-bit CLAs are required to construct the 16-bit CLA and to handle all the propagate (P) and generate (G) internal signals. CLA adders are commonly implemented as 4-bit modules, which are used to build larger adders. Overall, the power, delay, and area can be minimized in the VD-CLA-FIR method by using the 16-bit CLA.

5. Experimental results and discussion

This section discusses the experimental results of the proposed methodology and describes the experimental setup and the performance measures. The performance of the proposed methodology is evaluated through its ASIC and FPGA performance.

Table 3
Experimental results of the FPGA performances for the existing and VD-CLA-FIR method.

Target FPGA Circuit LUT Flip-flop Slice RAM ROM Frequency(MHz)

Virtex4 xc4vfx12 Existing-I [16] 8-tap 536/10,944 48/10,944 279/5472 1 2 51.25


16-tap 526/10,944 45/10,944 278/5472 1 2 62.36
32-tap 564/10,944 46/10,944 294/5472 1 2 42.21
Existing- II [7] 8-tap 516/10,944 42/10,944 269/5472 1 2 83.858
16-tap 518/10,944 43/10,944 268/5472 1 2 83.169
32-tap 553/10,944 42/10,944 288/5472 1 2 78.539
Existing-III [10] 8-tap 420/10,944 40/10,944 209/5472 1 2 57.21
16-tap 494/10,944 42/10,944 248/5472 1 2 56.21
32-tap 524/10,944 41/10,944 248/5472 1 2 63.54
LC-CSLA-FIR [8] 8-tap 365/10,944 38/10,944 196/5742 1 1 65.852
16-tap 379/10,944 39/10,944 198/5742 1 1 61.267
32-tap 445/10,944 40/10,944 230/5472 1 1 64.875
R8-CLA-FIR [22] 8-tap 97/10,944 34/10,944 53/5742 1 1 146.07
16-tap 107/10,944 35/10,944 57/5742 1 1 146.07
32-tap 126/10,944 37/10,944 68/5742 1 1 146.07
VD-CLA-FIR 8- tap 73/10,944 24/10,944 48/5742 1 1 248.463
16- tap 66/10,944 31/10,944 52/5742 1 1 152.616
32-tap 83/10,944 32/10,944 61/5742 1 1 152.616
Virtex5 xc5vlx20T Existing-I [16] 8-tap 286/12,480 38/12,480 125/3120 1 2 55.98
16-tap 288/12,480 42/12,480 105/3120 1 2 54.39
32-tap 381/12,480 44/12,480 189/3120 1 2 75.28
Existing- II [7] 8-tap 269/12,480 36/12,480 111/3120 1 2 100.441
16-tap 274/12,480 39/12,480 99/3120 1 2 88.790
32-tap 373/12,480 41/12,480 162/3120 1 2 104.980
Existing-III [10] 8-tap 215/12,480 35/12,480 88/3120 1 2 98.56
16-tap 222/12,480 38/12,480 78/3120 1 2 92.45
32-tap 305/12,480 40/12,480 104/3120 1 2 88.33
LC-CSLA-FIR [8] 8-tap 168/12,480 34/12,480 45/3120 1 1 91.654
16-tap 197/12,480 36/12,480 54/3120 1 1 92.336
32-tap 181/12,480 38/12,480 49/3120 1 1 88.975
R8-CLA-FIR [22] 8-tap 47/12,480 24/12,480 15/3120 1 1 138.679
16-tap 47/12,480 35/12,480 16/3120 1 1 138.609
32-tap 50/12,480 37/12,480 17/3120 1 1 138.609
VD-CLA-FIR 8- tap 32/12,480 24/12,480 10/3120 1 1 285.091
16- tap 42/12,480 31/12,480 14/3120 1 1 231.865
32-tap 63/12,480 32/12,480 15/3120 1 1 231.591
Virtex6 xc6vcx75t Existing-I [16] 8-tap 345/46,560 54/93,120 125/11,640 1 2 107.56
16-tap 398/46,560 44/93,120 135/11,640 1 2 121.2
32-tap 446/46,560 52/93,120 152/11,640 1 2 115.65
Existing- II [7] 8-tap 325/46,560 50/93,120 117/11,640 1 2 127.677
16-tap 369/46,560 42/93,120 127/11,640 1 2 115.873
32-tap 417/46,560 48/93,120 141/11,640 1 2 117.233
Existing-III [10] 8-tap 305/46,560 44/93,120 98/11,640 1 2 115.6
16-tap 248/46,560 39/93,120 78/11,640 1 2 117.23
32-tap 266/46,560 42/93,120 89/11,640 1 2 110.2
LC-CSLA-FIR [8] 8-tap 184/46,560 34/93,120 53/11,640 1 1 115.391
16-tap 186/46,560 36/93,120 56/11,640 1 1 112.626
32-tap 186/46,560 39/93,120 56/11,640 1 1 112.687
R8-CLA-FIR [22] 8-tap 46/46,560 21/93,120 14/11,640 1 1 177.98
16-tap 46/46,560 22/93,120 13/11,640 1 1 177.98
32-tap 54/46,560 23/93,120 16/11,640 1 1 178.4
VD-CLA-FIR 8- tap 34/46,560 24/93,120 11/11,640 1 1 234.731
16- tap 40/46,560 26/93,120 12/11,640 1 1 213.412
32-tap 56/46,560 21/93,120 15/11,640 1 1 184.190

5.1. Experimental setup

The proposed approach was evaluated on a system with 4 GB RAM, a 3.30 GHz i3 processor, and a 500 GB hard disk. The architecture was implemented in Verilog. MATLAB was used to read the ECG signal and to add WGN. The Modelsim 10.5 tool was used to write the Verilog code and to verify the timing diagram. Xilinx 14.4 was used to evaluate FPGA performance (LUT, flip-flop, slices, and frequency). Cadence RTL Compiler version 13.1 was used to calculate ASIC performance (area, power, and delay). BER, MSE, and SNR are 248.50, 0.141, and 0.1743 respectively for the input signal without noise, and 630.111, 69.711, and −0.6543 for the input signal with noise.

The input ECG signal read in MATLAB is shown in Fig. 8. WGN is added to the ECG signal, as shown in Fig. 9. This noisy data is converted to binary and written to text files. The text file is loaded into the Verilog RAM, which serves as one of the inputs of the FIR filter.

Table 1 compares the Existing-I, Existing-II, Existing-III, LC-CSLA-FIR, R8-CLA-FIR, and VD-CLA-FIR methods. These six methods were implemented in Verilog, and their outputs are tabulated in Table 1. In the VD-CLA-FIR method, both the FPGA and ASIC implementations focus on optimizing the hardware utilization. Table 1 covers different filter lengths (taps): 8-tap, 16-tap, and 32-tap. If the number of transistors is optimized, a better FIR filter design is obtained, which favors the 45 nm technology.

Fig. 6. Block diagram of the 16-bit CLA design.

Fig. 7. 4 -bit carry look-ahead adder.
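
The generate/propagate logic of a 4-bit CLA block, and the way four such blocks and a block-level carry generator form a 16-bit adder, can be modelled as below. This is a behavioural sketch with hypothetical function names; the loop over bit positions evaluates the same carry equations that the hardware expands and computes in parallel.

```python
def cla_4bit(a, b, cin=0):
    """4-bit carry look-ahead block: returns (sum, carry_out, group_G, group_P)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate: a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate: a_i XOR b_i
    c = [cin]
    for i in range(4):
        c.append(g[i] | (p[i] & c[i]))                 # c[i+1] = g_i OR (p_i AND c_i)
    total = sum((p[i] ^ c[i]) << i for i in range(4))  # sum bit = p_i XOR c_i
    group_g = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
               | (p[3] & p[2] & p[1] & g[0]))
    group_p = p[3] & p[2] & p[1] & p[0]
    return total, c[4], group_g, group_p

def cla_16bit(a, b, cin=0):
    """16-bit adder from four 4-bit CLA blocks and a block carry generator."""
    carry, total = cin, 0
    for blk in range(4):
        nib_a = (a >> (4 * blk)) & 0xF
        nib_b = (b >> (4 * blk)) & 0xF
        s, _, gg, gp = cla_4bit(nib_a, nib_b, carry)
        total |= s << (4 * blk)
        carry = gg | (gp & carry)   # block carry from group generate/propagate
    return total, carry

result, carry_out = cla_16bit(1234, 4321)
```

The group signals let the carry into each block be derived from P and G values alone, which is the property the 16-bit structure relies on.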

The 45 nm technology is significantly preferable for reducing area, power, and delay. In the 45 nm 8-tap architecture, the proposed method has an area of 2302 µm², a power of 74,143 nW, and a delay of 602.4 ps.

Figs. 10–12 show the comparison graphs of area, power, and delay for the existing and VD-CLA-FIR methods. From the comparison graphs, it is clear that the VD-CLA-FIR method has less area and delay than the existing methods. In Figs. 10–12, the first three taps (8-tap, 16-tap, and 32-tap) represent the 180 nm technology and the other three taps represent the 45 nm technology.

Table 2 presents the reduction percentage of area, delay, APP, and ADP for the 8-tap, 16-tap, and 32-tap filters. The output of this FIR architecture was obtained in both 180 nm and 45 nm technology. In 180 nm technology, VD-CLA-FIR reduces the area by 42.39%, the delay by 29.53%, the APP by 43.89%, and the ADP by 70.41%. In 45 nm technology, VD-CLA-FIR reduces the area by 13.2%, the delay by 32.25%, the APP by 24.37%, and the ADP by 39.02% compared to the conventional method.

Table 3 shows the comparison of the FPGA performance on different devices: Virtex-4, Virtex-5, and Virtex-6. From Table 3, it is concluded that the numbers of LUTs, flip-flops, and slices are reduced in the VD-CLA-FIR method compared to the existing methods. Due to the reduction of those parameters, the area of the FIR filter design is reduced. Figs. 13–15 show the comparison graphs of the FPGA performance (LUT, flip-flop, and slices) for the different FPGA devices. From Figs. 13, 14, and 15, it is clear that the FPGA performance parameters are improved in the VD-CLA-FIR method compared to the existing methods. In Figs. 13–15, the first three taps (8-tap, 16-tap, and 32-tap) represent Virtex 4, the second three taps represent Virtex 5, and the third three taps represent Virtex 6.
Fig. 8. Input ECG signal.

Fig. 16 shows the output waveform of the VD-CLA-FIR method, taken from Modelsim. The noisy data values (0, 255, etc.) are the inputs, which are multiplied with the coefficients (213, 18, etc.) to give the FIR filter result (y = 0). The coefficients are taken from the MATLAB FDA tool. Initially, the accumulator (Acc) holds zero, which is added to Y. The FIR filter result is stored in the next clock cycle. At the final stage, the FIR output is delivered from Acc. After the FIR filtering, the final output is written to a text file, which is given to MATLAB to obtain the de-noised ECG signal. The de-noised ECG signal is shown in Fig. 17.

5.2. Performance measure

5.2.1. Mean Square Error (MSE)


The MSE is the mean of the squared differences between the input signal and the de-noised output signal, as given in Eq. (2).

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( A_i - \hat{A}_i \right)^2 \tag{2}$$

Fig. 9. Noise added ECG signal.


Here, $A_i$ is the input signal, $\hat{A}_i$ is the de-noised output signal, and $n$ is the number of samples.
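
Eq. (2) translates directly into a few lines of code. The sketch below is a plain-Python illustration with a hypothetical function name, not the MATLAB routine used for the reported values.

```python
def mean_square_error(reference, estimate):
    """Mean of squared sample differences, as in Eq. (2)."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)

# Two samples match and one differs by 2, so the MSE is 4 / 3
example = mean_square_error([1, 2, 3], [1, 2, 5])
```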

Fig. 10. Area performance of existing and VD-CLA-FIR methods.



Fig. 11. The power performance of existing and VD-CLA-FIR methods.

Fig. 12. Delay performance of existing and VD-CLA-FIR methods.

Fig. 13. LUT performance of existing and VD-CLA-FIR methods.



Fig. 14. Flip Flop performance of existing and VD-CLA-FIR methods.

Fig. 15. Slice performance of existing and VD-CLA-FIR method.

Fig. 16. The output waveform of VD-CLA-FIR Method.



Fig. 17. De-noised ECG signal.

Fig. 18. RTL schematic for 8 tap FIR filter.

5.2.2. Signal to Noise Ratio (SNR)

The SNR is expressed on a logarithmic scale. It is the ratio of the output signal to the input signal, as given in Eq. (3).

$$\mathrm{SNR} = 10 \log_{10} \left( \frac{\text{Output signal}}{\text{Input signal}} \right) \tag{3}$$
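
The SNR of Eq. (3) and the bit error ratio defined in Section 5.2.3 (Eq. (4)) can be sketched as follows. Two assumptions are made for the example: the "signal" terms in Eq. (3) are taken as signal powers (sums of squared samples), and bit errors are counted by comparing fixed-width unsigned words. The function names are hypothetical.

```python
import math

def snr_db(output, reference):
    """10*log10 of the output-to-input power ratio (power is an assumption)."""
    p_out = sum(v * v for v in output)
    p_in = sum(v * v for v in reference)
    return 10 * math.log10(p_out / p_in)

def bit_error_ratio(sent, received, width=8):
    """Number of differing bits divided by the total number of bits sent."""
    mask = (1 << width) - 1
    errors = sum(bin((s ^ r) & mask).count("1") for s, r in zip(sent, received))
    return errors / (len(sent) * width)

gain_db = snr_db([2.0, 2.0], [1.0, 1.0])             # power ratio of 4
ber = bit_error_ratio([0b1111], [0b1110], width=4)   # 1 wrong bit out of 4
```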
5.2.3. Bit Error Rate (BER)

The Bit Error Rate (BER) is the number of bit errors per unit time. The bit error ratio is the number of bit errors divided by the total number of transferred bits during a studied time interval. The ratio between the number of errors and the total number of bits sent is called the BER, as given in Eq. (4).

$$\mathrm{BER} = \frac{\text{Number of errors}}{\text{Total number of bits sent}} \tag{4}$$

With the help of these equations, the BER, MSE, and SNR are calculated for the Input signal With Output signal (IWO) as well as the Input signal With Noisy signal (IWN). The de-noised output signal has lower BER and MSE, and a higher SNR, as the values below show. These performances are evaluated using MATLAB software.

Parameter | Input signal without noise | Input signal with noise
BER | 248.50 | 630.111
MSE | 0.141 | 69.711
SNR | 0.1743 dB | −0.6543 dB

Fig. 18 shows the RTL schematic of the 8-tap FIR filter of the VD-CLA-FIR method, taken from the Cadence RTL Compiler. The Cadence RTL Compiler was employed to convert the RTL Verilog code to gate-level Verilog code. The Verilog code is read through a Tcl file, and the respective libraries are set in the Tcl file. After synthesis, the area, power, and delay results are reported in Cadence, as shown in Fig. 19 for verification purposes.

Fig. 19. Area, power, and delay of 180 nm for 8-tap FIR filter.

Fig. 20. FPGA performance of Virtex 4 for 8 tap FIR filter.

FPGA performance of Virtex 4 for 8 taps is shown in Fig. 20, which clearly shows that the LUT, flip-flop, and slice counts are lower in the VD-CLA-FIR method than in the existing methods.

6. Conclusion

In this work, the VD-CLA-FIR filter design, based on the VD algorithm with the CLA approach, was implemented in ModelSim using Verilog code. Initially, the ECG signals are read in MATLAB and AWGN is added to them. The noisy values are written to a text file, which is given as the input to the Verilog design. In this paper, the VD algorithm performs the multiplication operation. The overall FIR filter effectively removes the noise from the ECG signal. Using the de-noised signal, MSE, SNR, and BER are evaluated in MATLAB. The Verilog code is used to evaluate the FPGA and ASIC performance. In the FPGA implementation, LUT, slice, and flip-flop usage is reduced in the VD-CLA-FIR filter design. In 180 nm technology, the VD-CLA-FIR design reduces area by 42.39%, delay by 29.53%, APP by 43.89%, and ADP by 70.41%. In 45 nm technology, it reduces area by 13.2%, delay by 32.25%, APP by 24.37%, and ADP by 39.02% compared to the conventional method. In the future, different kinds of architecture will be used to improve the hardware performance with an effective denoising process.

Declaration of Competing Interest

There is no conflict of interest.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.micpro.2019.102883.

References

[1] X. Lou, P.K. Meher, Y. Yu, W. Ye, Novel structure for area-efficient implementation of FIR filters, IEEE Trans. Circuits Syst. Express Briefs 64 (10) (2017) 1212–1216.
[2] S.J. Lee, J.W. Choi, S.W. Kim, J. Park, A reconfigurable FIR filter architecture to trade off filter performance for dynamic power consumption, IEEE Trans. Very Large Scale Integr. VLSI Syst. 19 (12) (2011) 2221–2228.
[3] Chia-Yu Yao, Wei-Chun Hsia, Yung-Hsiang Ho, Designing hardware-efficient fixed-point FIR filters in an expanding subexpression space, IEEE Trans. Circuits Syst. I 61 (1) (2014) 202–212.
[4] Andrea Bonetti, Adam Teman, Philippe Flatresse, Andreas Burg, Multipliers-driven perturbation of coefficients for low-power operation in reconfigurable FIR filters, IEEE Trans. Circuits Syst. I 64 (9) (2017) 2388–2400.
[5] Jiajia Chen, Jinghong Tan, Chip-Hong Chang, Feng Feng, A new cost-aware sensitivity-driven algorithm for the design of FIR filters, IEEE Trans. Circuits Syst. I 64 (6) (2017) 1588–1598.
[6] Iqbal, J.L. Mazher, S. Varadarajan, High performance reconfigurable FIR filter architecture using optimized multiplier, Circuits Syst. Signal Process. 32 (2) (2013) 663–682.
[7] C. Xu, S. Yin, Y. Qin, H. Zou, A novel hardware efficient FIR filter for wireless sensor networks, in: Ubiquitous and Future Networks (ICUFN), 2013 Fifth International Conference on, IEEE, 2013, July, pp. 197–201.
[8] Sumalatha Madugula, Panchala Venkata Naganjaneyulu, Kodati Satya Prasad, Implementation of FIR filter for low power and area minimization using shift–add method without multipliers, Int. J. Intell. Eng. Syst. 10 (6) (2017).
[9] D. Shi, Y.J. Yu, Design of linear phase FIR filters with high probability of achieving minimum number of adders, IEEE Trans. Circuits Syst. I 58 (1) (2011) 126–136.
[10] S. Bhattacharjee, S. Sil, A. Chakrabarti, Evaluation of power efficient FIR filter for FPGA based DSP applications, Procedia Technol. 10 (2013) 856–865.
[11] S. Harize, M. Benouaret, N. Doghmane, A methodology for implementing decimator FIR filters on FPGA, AEU-Int. J. Electron. Commun. 67 (12) (2013) 993–1004.
[12] Y.C. Tsao, K. Choi, Area-efficient VLSI implementation for parallel linear-phase FIR digital filters of odd length based on fast FIR algorithm, IEEE Trans. Circuits Syst. II 59 (6) (2012) 371–375.
[13] U. Meyer-Baese, G. Botella, D.E. Romero, M. Kumm, Optimization of high speed pipelining in FPGA-based FIR filter design using genetic algorithm, in: Independent Component Analyses, Compressive Sampling, Wavelets, Neural Net, Biosystems, and Nanoengineering X, 8401, International Society for Optics and Photonics, 2012, May, p. 84010R.
[14] S. Khan, Z.A. Jaffery, Low power FIR filter implementation on FPGA using parallel distributed arithmetic, in: India Conference (INDICON), 2015 Annual IEEE, IEEE, 2015, December, pp. 1–5.
[15] B. Rashidi, B. Rashidi, M. Pourormazd, Design and implementation of low power digital FIR filter based on low power multipliers and adders on Xilinx FPGA, in: Electronics Computer Technology (ICECT), 2011 3rd International Conference on, 2, IEEE, 2011, April, pp. 18–22.
[16] Sang Yoon Park, Pramod Kumar Meher, Efficient FPGA and ASIC realizations of DA-based reconfigurable FIR digital filter, IEEE Transactions on Circuits and Systems-II: Express Briefs.
[17] Jiajia Chen, Jinghong Tan, Chip-Hong Chang, Feng Feng, A new cost-aware sensitivity-driven algorithm for the design of FIR filters, IEEE Trans. Circuits Syst. I PP 99 (2016) 1–11.
[18] B. Doss, K. Soundararajanm, Y. Narasimha Murthy, Low-power and low-area adaptive FIR filter based on DA using FPGA, Int. J. Sci. Res. Manage. 3 (1) (2015).
[19] Sang Yoon Park, Pramod Kumar Meher, Low-power, high-throughput, and low-area adaptive FIR filter based on distributed arithmetic, IEEE Trans. Circuits Syst. II 60 (6) (2013) 346–350.
[20] Martin Kumm, Konrad Möller, Peter Zipf, Reconfigurable FIR filter using distributed arithmetic on FPGAs, in: Circuits and Systems (ISCAS), 2013 IEEE International Symposium on, IEEE, 2013.
[21] S. Ramanathan, Gorty Anand, Prasanth Reddy, Sri Adibhatla Sridevi, Low power adaptive FIR filter based on distributed arithmetic, Int. J. Eng. Res. Appl. 6 (5) (2016) 47–51.
[22] Madugula Sumalatha, Panchala Venkata Naganjaneyulu, Kodati Satya Prasad, Low power and low area VLSI implementation of Radix-8 Carry look ahead adder FIR filter for DSP applications, J. Adv. Res. Dyn. Control Syst. 10 (10-Special Issue) (2018) 70–82.

M. Sumalatha received the B.Tech degree in ECE from JNTU Hyderabad and the M.Tech from JNTUH. She is now a Research Scholar at JNTU Kakinada and works as an Associate Professor in the ECE Department, Sai Spurthi Institute of Technology, B.Gangaram. Her research interests are in signal processing through VLSI. She has published her research work in several journals and conferences.

Dr P.V. Naganjaneyulu received the Ph.D. in Communications from JNTU Kakinada and is currently Professor and Principal of Sri Mittapalli College of Engineering, Tummalapalem, Guntur District, A.P. He has guided 3 Ph.D. scholars, and at present 12 scholars are working with him. He specializes in communication and signal processing. He has published more than 50 journal papers and 10 conference papers. His current research interests are communications, VLSI signal processing, and image processing. He is a member of IETE.

Dr K. Satya Prasad received the B.Tech. degree in Electronics and Communication Engineering from JNTU College of Engineering, Anantapur, Andhra Pradesh, India in 1977, the M.E. degree in Communication Systems from Guindy College of Engg., Madras University, Chennai, India in 1979, and the Ph.D. from Indian Institute of Technology, Madras in 1989. He started his teaching career as a Teaching Assistant at Regional Engineering College, Warangal in 1979. He joined JNT University, Hyderabad as a Lecturer in 1980 and served in different constituent colleges, viz., Kakinada, Hyderabad, and Anantapur, and in different capacities, viz., Associate Professor, Professor, Head of the Department, Vice Principal, and Principal. At JNTUK University he served as Director of Evaluation, Director IST, and Rector. He has published more than 250 technical papers in various national and international conferences and journals and authored four textbooks. He has guided 33 Ph.D. scholars, and at present 20 scholars are working with him. His areas of research include communications, signal processing, image processing, neural networks, and ad-hoc wireless networks. Dr Prasad is a Fellow member of various professional bodies such as IEEE, IETE, IE (I), and ISTE. After retiring from JNTUK service, Prasad worked as Pro Vice Chancellor of KL University. At present he is working as Rector of Vignan’s Foundation for Science, Technology and Research, Guntur, Andhra Pradesh.
