FIR Filter Design
Low power and low area VLSI implementation of vedic design FIR
filter for ECG signal de-noising
M. Sumalatha a,∗, P.V. Naganjaneyulu b, K. Satya Prasad c
a ECE Department, Sai Spurthi Institute of Technology, B.Gangaram, Telangana 507303, India
b Sri Mittapalli College of Engineering, Tummalapalem, NH16, Guntur, Andhra Pradesh, India
c Rector of Vignan’s Foundation for Science, Technology & Research, Guntur, Andhra Pradesh, India
Article info
Article history:
Received 28 May 2019
Revised 7 August 2019
Accepted 29 August 2019
Available online 30 August 2019
Keywords:
Carry look ahead-adder
Electrocardiogram
Finite impulse response
Signal processing
Vedic design

Abstract
In recent years, the Finite Impulse Response (FIR) filter has played a major role in signal processing applications. Many earlier papers describe different types of FIR filter design, but none of them addresses a signal de-noising application with an effective multiplier design. In this paper, a Vedic Design - Carry Lookahead Adder FIR (VD-CLA-FIR) filter architecture is introduced to perform the FIR filter operation for Electrocardiogram (ECG) signal de-noising. Using a MATLAB program, the input ECG signal is read and Additive White Gaussian Noise (AWGN) is added to it. The de-noising process is implemented in Verilog and the output is written to text files. To recover the de-noised signal, the binary text values are read back into MATLAB. From the Verilog code, FPGA performance (LUT, flip-flop, slices, and frequency) and ASIC performance (area, power, and delay) are evaluated. For the ASIC implementation, 180 nm and 45 nm technologies are used, and for the FPGA implementation, Virtex-4, Virtex-5, and Virtex-6 devices are used. The Mean Square Error (MSE), Bit Error Rate (BER), and Signal to Noise Ratio (SNR) are calculated from the de-noised signal. In 180 nm technology, VD-CLA-FIR reduces area by 42.39%, delay by 29.53%, APP by 43.89%, and ADP by 70.41%. In 45 nm technology, the VD-CLA-FIR method reduces area by 13.2%, delay by 32.25%, APP by 24.37%, and ADP by 39.02% compared to the conventional methods.
© 2019 Elsevier B.V. All rights reserved.
∗ Corresponding author.
E-mail address: [email protected] (M. Sumalatha).
https://doi.org/10.1016/j.micpro.2019.102883
0141-9331/© 2019 Elsevier B.V. All rights reserved.
2 M. Sumalatha, P.V. Naganjaneyulu and K.S. Prasad / Microprocessors and Microsystems 71 (2019) 102883
The remainder of this paper is organized as follows. Section 2 presents an extensive survey of recent papers on FIR filter design. Section 3 briefly describes the problem statement, and Section 4 explains the FIR architecture using the VD algorithm with the CLA approach. Section 5 presents comparative experimental results for the proposed VD-CLA-FIR filter design and conventional methods. Section 6 concludes the work.

2. Related work

Researchers have suggested several methods for the FIR architecture. This section presents a brief evaluation of some significant contributions to the field of FIR architecture.

Author | Methodology | Advantage | Disadvantage
Chen et al. [17] | A new cost-aware Sensitivity Driven Algorithm (SDA) is used for the design of the FIR filter | The method reduced area by 20.9% and power by 72.7% across different FIR filter sizes | High computational complexity
Doss, Soundararajanm, and Narasimha Murthy [18] | A pipelined architecture used for adaptive FIR filter design based on the DA algorithm | A high throughput rate is achieved by the updated LUT. The method reduced the sampling period and area complexity | The method used a faster bit clock for carry-save accumulation, but a much slower clock for the remaining FIR filter operations
Park et al. [19] | A low power, high throughput, low area adaptive FIR filter design based on the DA algorithm | The throughput rate of the FIR filter is increased by using a parallel Lookup Table (LUT). Lower power consumption is achieved by using carry save accumulation (CSA) with a faster bit clock | Low speed for all other filter operations
Martin Kumm, Konrad Moller, and Peter Zipf [20] | The work compared two FIR architectures, DA and LUT multiplication, for Field Programmable Gate Array (FPGA) | The FIR filter architecture achieved a reconfiguration time of less than 100 nanoseconds (ns) | Increase in the slice count
Ramanathan, Anand, Reddy, and Sridevi [21] | A low power adaptive FIR filter designed based on the DA algorithm. The Least Mean Square (LMS) algorithm is employed to reduce the Mean Square Error (MSE) between the filter output and the desired response | The pipelined DA table reduced the switching activity and power consumption in the FIR filter design | The technique used a carry save accumulator for the FIR filter architecture, which occupied more area

Solutions: To overcome the above-mentioned drawbacks, the VD-CLA-FIR architecture is implemented in this paper. The proposed Vedic multiplier is designed using the Urdhva Tiryakbhyam Sutra (Vertically and Crosswise Algorithm). In this algorithm, the partial products and their sums are calculated in parallel, and because of this the multiplier is independent of the clock frequency of the processor. This method is applied in the FIR design methodology to enhance filtering functions for eliminating unwanted noise. The Vedic multiplier based FIR filter design would be a good choice for high-speed DSP applications. The VD-CLA-FIR architecture is used here for the ECG de-noising application. In the PE, the Vedic multiplier performs the multiplication operation, which helps to reduce hardware utilization. The optimal CLA adder is also used in the FIR filter. As a result, the processing speed and area are improved in this method.

4. Vedic- CLA - FIR filter methodology

Step 1: The overall block diagram is shown in Fig. 1. In MATLAB, the ECG signal is read from the arrhythmia database, which is taken from an internet source.
Step 2: White Gaussian Noise (WGN) is added to the ECG signal, with a noise density of 0.1 to 0.5. In MATLAB, the “awgn” function is used to add the noise to the input signal.
Step 3: Using the MATLAB dec2bin function, the noisy ECG samples are converted into binary format.
Step 4: The binary signal is written to a text file (e.g., Noisy_ECG_signal.txt).
Step 5: The Noisy_ECG_signal.txt file is given as the input to the Verilog design. Normally the input values would be generated randomly; here the noisy binary values are used as the input and stored in RAM. The coefficients are stored in ROM.
Step 6: In Verilog, the noise is reduced using the Vedic Design-Carry Look-ahead Adder-FIR (VD-CLA-FIR) filter. The VD-CLA-FIR filter architecture designed in this work is shown in Fig. 2.
Step 7: On every clock cycle, the accumulator result is stored in a text file (e.g., Filter_output.txt). The Verilog code is also used to measure the ASIC performance (area, power, delay) and the FPGA performance (LUT, flip-flops, slices, and frequency).
Step 8: The Filter_output.txt file is read into MATLAB to obtain the de-noised ECG signal.
Step 9: From the de-noised ECG signal, MSE, SNR, and BER are calculated.

4.1. Vedic design CLA based FIR filter architecture
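As an illustration of the Vertically and Crosswise multiplication scheme named in the Solutions paragraph, the following is a minimal behavioural sketch in Python, not the authors' Verilog. It assumes unsigned operands and a default width of 4 bits; the function names `to_bits`, `vedic_columns`, and `vedic_multiply` are hypothetical. Each column sum depends only on the input bits, which is why the partial products can be formed in parallel in hardware; only the final carry resolution is sequential here.

```python
def to_bits(value, width):
    """Return the bits of an unsigned integer, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def vedic_columns(a, b, width):
    """Vertically-and-crosswise step: column k collects every a[i] & b[j] with i + j == k.

    The columns are independent of each other, so hardware can compute
    them all at the same time.
    """
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    columns = []
    for k in range(2 * width - 1):
        lo = max(0, k - width + 1)
        hi = min(k, width - 1)
        columns.append(sum(a_bits[i] & b_bits[k - i] for i in range(lo, hi + 1)))
    return columns


def vedic_multiply(a, b, width=4):
    """Combine the column sums into a 2*width-bit product by propagating carries."""
    product, carry = 0, 0
    for k, column in enumerate(vedic_columns(a, b, width)):
        total = column + carry
        product |= (total & 1) << k   # low bit stays in this column
        carry = total >> 1            # the rest moves to the next column
    product |= carry << (2 * width - 1)  # final carry is the top product bit
    return product
```

With the default width, two 4-bit operands give an 8-bit result, matching the operand and result sizes mentioned in the text. In a hardware design the carry resolution would be done by adders such as the CLA rather than by a loop.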
Table 1
Experimental results of the ASIC performance of the existing and VD-CLA-FIR methods.
Technology | Method | Length | Area (um2) | Power (nW) | Delay (ps) | APP (um2 * nW) | ADP (um2 * ps)
Table 2
Reduction percentage of the ASIC performance for the VD-CLA-FIR method.
Technology | Window | Reduction % of area | Reduction % of delay | Reduction % of APP | Reduction % of ADP
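The derived columns in Tables 1 and 2 can be sketched as follows. This assumes, from the units in the Table 1 header, that APP is the product of area and power and ADP is the product of area and delay, and it assumes the reduction percentage is measured relative to the existing method; the paper's surviving text does not spell out either formula. The function names are hypothetical, and the baseline values in the last line are made-up illustration numbers, not results from the paper.

```python
def area_power_product(area_um2, power_nw):
    """APP in um2 * nW (assumed definition: area times power)."""
    return area_um2 * power_nw


def area_delay_product(area_um2, delay_ps):
    """ADP in um2 * ps (assumed definition: area times delay)."""
    return area_um2 * delay_ps


def reduction_percent(existing, proposed):
    """Assumed definition: saving of the proposed design relative to the existing one."""
    return 100.0 * (existing - proposed) / existing


# 45 nm, 8-tap VD-CLA-FIR figures quoted in Section 5 of the text.
app = area_power_product(2302, 74143)
adp = area_delay_product(2302, 602.4)

# Hypothetical baseline of 200 versus a proposed value of 150, for illustration only.
example_reduction = reduction_percent(200.0, 150.0)
```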
stored in one more adder. The results of these two adders are given as the inputs to the final adder. Finally, the 8-bit result is delivered at the output of the Vedic multiplier design. This is the design of the 4-bit Vedic multiplier, which is instantiated 2 times to produce the 8-bit Vedic design operation used in the FIR filter architecture.

area and power dissipation. The 16-bit CLA consists of four 4-bit CLA blocks and a carry generator, as shown in Fig. 7. Four 4-bit CLAs are required to construct the 16-bit CLA and to handle the propagate (P) and generate (G) internal signals. CLA adders are commonly implemented as 4-bit modules, which are used to build larger adders. Overall, power, delay, and area can be minimized in the VD-CLA-FIR method by using the 16-bit CLA.
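The 16-bit CLA structure described above (four 4-bit blocks plus a carry generator working on group P and G signals) can be modelled behaviourally as in the sketch below. It uses the standard carry look-ahead equations and is not taken from the authors' Verilog; all function names are hypothetical. The carry generator is written as a loop for brevity, whereas hardware would typically expand it into flat logic so that the block carries do not ripple.

```python
def bit(x, i):
    """Return bit i of integer x."""
    return (x >> i) & 1


def gen_prop(a, b):
    """Per-bit generate (g = a AND b) and propagate (p = a XOR b) for a 4-bit slice."""
    g = [bit(a, i) & bit(b, i) for i in range(4)]
    p = [bit(a, i) ^ bit(b, i) for i in range(4)]
    return g, p


def group_signals(g, p):
    """Group generate and propagate of one 4-bit block (independent of carry-in)."""
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return G, P


def cla4_sum(g, p, c0):
    """Sum bits of a 4-bit block; each carry is a flat function of g, p, and c0."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    carries = [c0, c1, c2, c3]
    return sum((p[i] ^ carries[i]) << i for i in range(4))


def cla16(a, b, carry_in=0):
    """16-bit adder from four 4-bit CLA blocks; returns (sum, carry_out)."""
    blocks = [gen_prop((a >> (4 * k)) & 0xF, (b >> (4 * k)) & 0xF) for k in range(4)]
    total, carry = 0, carry_in
    for k, (g, p) in enumerate(blocks):
        total |= cla4_sum(g, p, carry) << (4 * k)
        G, P = group_signals(g, p)
        carry = G | (P & carry)  # carry generator: carry into the next block
    return total, carry
```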
Table 3
Experimental results of the FPGA performances for the existing and VD-CLA-FIR method.
5.1. Experimental setup

The proposed approach was evaluated on a machine with 4 GB RAM, a 3.30 GHz i3 processor, and a 500 GB hard disk. The architecture was implemented in the Verilog language. MATLAB was used to read the ECG signal and add WGN. ModelSim 10.5 was used to write the Verilog code and verify the timing diagram. Xilinx 14.4 was used to evaluate FPGA performance (LUT, flip-flop, slices, and frequency). Cadence RTL Compiler version 13.1 was used to calculate ASIC performance (area, power, and delay). BER, MSE, and SNR are 248.50, 0.141, and 0.1743 respectively for the input signal without noise, and 630.111, 69.711, and −0.6543 for the input signal with noise.

The input ECG signal read in MATLAB is shown in Fig. 8. WGN is added to the ECG signal, as shown in Fig. 9. This noisy data is converted to binary and written to text files. The text file is loaded into the Verilog RAM, which serves as one of the inputs of the FIR filter.
Table 1 presents a comparison of Existing-I, Existing-II, Existing-III, LC-CSLA-FIR, R8-CLA-FIR, and VD-CLA-FIR. These six methods were implemented in Verilog and their outputs are tabulated in Table 1. In the VD-CLA-FIR method, both the FPGA and ASIC implementations focus on optimizing hardware utilization. Table 1 covers different filter lengths (taps): 8-tap, 16-tap, and 32-tap. If the number of transistors is optimized, it offers a better output of the FIR filter design. So, 45 nm technology is
significantly preferable to reduce area, power, and delay. In the 45 nm 8-tap architecture, the proposed method has an area of 2302 um2, a power of 74,143 nW, and a delay of 602.4 ps.
Figs. 10–12 show comparison graphs of area, power, and delay for the existing and VD-CLA-FIR methods. From the comparison graphs, it is clear that the VD-CLA-FIR method consumes less area and delay than the existing methods. In Figs. 10–12, the first three tap groups (8-tap, 16-tap, and 32-tap) correspond to 180 nm technology and the other three to 45 nm technology.
Table 2 presents the reduction percentage of area, delay, APP, and ADP for 8-tap, 16-tap, and 32-tap filters respectively. This FIR archi-
\[
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( A_i - \hat{A}_i \right)^2 \tag{2}
\]
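Eq. (2) can be evaluated directly from two equal-length sample sequences, as in the sketch below. It assumes A_i denotes the reference (original) ECG samples and Â_i the de-noised samples, since the surrounding definition did not survive in this text. The SNR function uses one common definition (signal power over error power, in dB); the paper's own SNR formula is not shown here, so treat that part as an assumption. Function names are hypothetical.

```python
import math


def mse(reference, estimate):
    """Mean square error of Eq. (2): average of squared sample differences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - a_hat) ** 2 for a, a_hat in zip(reference, estimate)) / len(reference)


def snr_db(reference, estimate):
    """SNR in dB under an assumed definition: 10*log10(signal power / error power)."""
    signal_power = sum(a ** 2 for a in reference)
    error_power = sum((a - a_hat) ** 2 for a, a_hat in zip(reference, estimate))
    if error_power == 0:
        return math.inf  # identical signals: no error energy
    return 10.0 * math.log10(signal_power / error_power)
```

A lower MSE and a higher SNR both indicate that the de-noised signal is closer to the reference.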
Fig. 19. Area, power, and delay of 180 nm for 8-tap FIR filter.
The FPGA performance on Virtex-4 for 8 taps is shown in Fig. 20; it clearly shows that LUT, flip-flop, and slice usage is lower in the VD-CLA-FIR method than in the existing methods.

6. Conclusion

In this work, the VD-CLA-FIR filter design, based on the VD algorithm with the CLA approach, was implemented in Verilog and simulated in ModelSim. Initially, the ECG signals are read in MATLAB and corrupted with AWGN. The noisy values are written to a text file, which is given as the input to the Verilog design. In this paper, the VD algorithm performs the multiplication operation. The overall FIR filter effectively removes the noise from the ECG signal. From the de-noised signal, MSE, SNR, and BER have been evaluated in MATLAB. The Verilog code is used to evaluate the FPGA and ASIC performance. In the FPGA implementation, LUT, slice, and flip-flop usage is reduced in the VD-CLA-FIR filter design. In 180 nm technology, VD-CLA-FIR reduces area by 42.39%, delay by 29.53%, APP by 43.89%, and ADP by 70.41%. In 45 nm technology, the VD-CLA-FIR method reduces area by 13.2%, delay by 32.25%, APP by 24.37%, and ADP by 39.02% compared to the conventional method. In the future, different kinds of architecture will be used to improve the hardware performance with an effective de-noising process.

Declaration of Competing Interest

There is no conflict of interest.

Supplementary materials

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.micpro.2019.102883.

References

[1] X. Lou, P.K. Meher, Y. Yu, W. Ye, Novel structure for area-efficient implementation of FIR filters, IEEE Trans. Circuits Syst. Express Briefs 64 (10) (2017) 1212–1216.
[2] S.J. Lee, J.W. Choi, S.W. Kim, J. Park, A reconfigurable FIR filter architecture to trade off filter performance for dynamic power consumption, IEEE Trans. Very Large Scale Integr. VLSI Syst. 19 (12) (2011) 2221–2228.
[3] Chia-Yu Yao, Wei-Chun Hsia, Yung-Hsiang Ho, Designing hardware-efficient fixed-point FIR filters in an expanding subexpression space, IEEE Trans. Circuits Syst. I 61 (1) (2014) 202–212.
[4] Andrea Bonetti, Adam Teman, Philippe Flatresse, Andreas Burg, Multipliers-driven perturbation of coefficients for low-power operation in reconfigurable FIR filters, IEEE Trans. Circuits Syst. I 64 (9) (2017) 2388–2400.
[5] Jiajia Chen, Jinghong Tan, Chip-Hong Chang, Feng Feng, A new cost-aware sensitivity-driven algorithm for the design of FIR filters, IEEE Trans. Circuits Syst. I 64 (6) (2017) 1588–1598.
[6] Iqbal, J.L. Mazher, S. Varadarajan, High performance reconfigurable FIR filter architecture using optimized multiplier, Circuits Syst. Signal Process. 32 (2) (2013) 663–682.
[7] C. Xu, S. Yin, Y. Qin, H. Zou, A novel hardware efficient FIR filter for wireless sensor networks, in: Ubiquitous and Future Networks (ICUFN), 2013 Fifth International Conference on, IEEE, 2013, July, pp. 197–201.
[8] Sumalatha Madugula, Panchala Venkata Naganjaneyulu, Kodati Satya Prasad, Implementation of FIR filter for low power and area minimization using shift-add method without multipliers, Int. J. Intell. Eng. Syst. 10 (6) (2017).
[9] D. Shi, Y.J. Yu, Design of linear phase FIR filters with high probability of achieving minimum number of adders, IEEE Trans. Circuits Syst. I 58 (1) (2011) 126–136.
[10] S. Bhattacharjee, S. Sil, A. Chakrabarti, Evaluation of power efficient FIR filter for FPGA based DSP applications, Procedia Technol. 10 (2013) 856–865.
[11] S. Harize, M. Benouaret, N. Doghmane, A methodology for implementing decimator FIR filters on FPGA, AEU-Int. J. Electron. Commun. 67 (12) (2013) 993–1004.
[12] Y.C. Tsao, K. Choi, Area-efficient VLSI implementation for parallel linear-phase FIR digital filters of odd length based on fast FIR algorithm, IEEE Trans. Circuits Syst. II 59 (6) (2012) 371–375.
[13] U. Meyer-Baese, G. Botella, D.E. Romero, M. Kumm, Optimization of high speed pipelining in FPGA-based FIR filter design using genetic algorithm, in: Independent Component Analyses, Compressive Sampling, Wavelets, Neural Net, Biosystems, and Nanoengineering X, 8401, International Society for Optics and Photonics, 2012, May, p. 84010R.
[14] S. Khan, Z.A. Jaffery, Low power FIR filter implementation on FPGA using parallel distributed arithmetic, in: India Conference (INDICON), 2015 Annual IEEE, IEEE, 2015, December, pp. 1–5.
[15] B. Rashidi, B. Rashidi, M. Pourormazd, Design and implementation of low power digital FIR filter based on low power multipliers and adders on Xilinx FPGA, in: Electronics Computer Technology (ICECT), 2011 3rd International Conference on, 2, IEEE, 2011, April, pp. 18–22.
[16] Sang Yoon Park, Pramod Kumar Meher, Efficient FPGA and ASIC realizations of DA-based reconfigurable FIR digital filter, IEEE Transactions on Circuits and Systems-II: Express Briefs.
[17] Jiajia Chen, Jinghong Tan, Chip-Hong Chang, Feng Feng, A new cost-aware sensitivity-driven algorithm for the design of FIR filters, IEEE Trans. Circuits Syst. I PP 99 (2016) 1–11.
[18] B. Doss, K. Soundararajanm, Y. Narasimha Murthy, Low-power and low-area adaptive FIR filter based on DA using FPGA, Int. J. Sci. Res. Manage. 3 (1) (2015).
[19] Sang Yoon Park, Pramod Kumar Meher, Low-power, high-throughput, and low-area adaptive FIR filter based on distributed arithmetic, IEEE Trans. Circuits Syst. II 60 (6) (2013) 346–350.