

A Peer Reviewed Open Access International Journal

# **Design of FIR Filter Using Efficient Carry Select Adder**

S.L.Sukanya PG Scholar, Dept of ECE, Godavari Institute of Engineering and Technology, Rajahmundry, East Godavari, Andhra Pradesh, India.

N.M.R.L.Rao Assistant Professor, Dept of ECE, Godavari Institute of Engineering and Technology, Rajahmundry, East Godavari, Andhra Pradesh, India.

### **ABSTRACT:**

In this paper, an efficient 8-tap FIR filter is designed using an efficient carry select adder (CSLA) and a Wallace tree multiplier. As the complexity of digital circuits keeps increasing, efficient performance is a key parameter in the design of FIR filters and a main contributing factor to the popularity of DSP systems. The two main operations of an FIR filter are addition and multiplication, so this paper replaces the multiplier with a Wallace tree multiplier and the adder with an efficient carry select adder. In this adder, all the redundant logic operations present in the conventional CSLA are eliminated and a new logic formulation for the CSLA is proposed. In the proposed adder, the carry select (CS) operation is scheduled before the calculation of the final sum, which differs from the conventional approach. The Wallace tree multiplier is generally preferred for reducing the number of partial products. By employing the 16-bit carry select adder and the 8-bit Wallace tree multiplier, an efficient 8-tap FIR filter is designed.

### **Keywords:**

FIR filter, carry select adder, Wallace tree multiplier, low power design.

### I. INTRODUCTION:

Digital signal processing (DSP) is used in a wide range of applications such as radio, television and video. Its basic tools are digital filters. Two different kinds of filters exist: analog filters and digital filters. In analog filters, components such as resistors and capacitors are employed to generate the required filtering effect. Noise reduction and video signal enhancement are typical applications in which analog filters are employed.

In analog filters, the signal is in the form of an electrical voltage or current. The other class is digital filters. In these filters, the analog input signal is passed through an analog-to-digital converter, and the resulting binary numbers are transferred to a processor, which performs numerical calculations on them. In digital filters, the signal is therefore represented by numbers rather than by a physical quantity. Digital filters are further classified into two types: finite impulse response (FIR) filters and infinite impulse response (IIR) filters. Both types have their own advantages and disadvantages. Since the advantages of FIR filters outweigh their disadvantages, they are generally preferred over IIR filters in digital signal processing applications.

Filtering requires arithmetic operations. The adder and multiplier modules consume much of the circuit area and power, and the complexity of an FIR filter is mainly due to its multiplication and addition operations. Adders and Wallace or Dadda multipliers are applied in filters to reduce the power consumption caused by unwanted data transitions. In digital signal processing, arithmetic units such as multipliers and adders play a major role. The multipliers commonly used are the Wallace and Dadda multipliers. An adder is an essential component of an arithmetic unit. A complex digital signal processing system involves several adders, so an efficient adder design enhances the performance of the whole DSP system. The adders considered for comparison are the ripple carry adder, carry look-ahead adder, carry save adder and carry select adder. Here, a combination of an efficient carry select adder and a Wallace tree multiplier is employed to design an efficient FIR filter.




The paper is organized as follows: Section II reviews the existing adders and multipliers for FIR filter design, Section III describes the proposed FIR filter design using the efficient adder and the Wallace tree multiplier, results and performance analysis are presented in Section IV, and the conclusion is presented in Section V.

### **II. EXISTING ADDERS AND MULTIPLIERS**

FIR filters basically consist of multipliers, delay elements and adders. The different kinds of multipliers and adders are described below:

### **MULTIPLIERS:**

Multiplication involves two main operations: partial product generation and partial product accumulation. There are therefore two possible ways to increase the efficiency of a multiplier: reducing the complexity of partial product generation, and reducing the time needed to accumulate the partial products. Both solutions can be applied simultaneously.

Multipliers consist of three stages. In the first stage, the partial product matrix is formed. In the second stage, this matrix is reduced to a height of two. In the final stage, the two remaining rows are combined by a carry-propagate adder. Both Wallace and Dadda multipliers are employed for reducing partial products. In Wallace multipliers, the partial products are reduced as quickly as possible, whereas the Dadda multiplier performs the minimum reduction necessary at each level. Since both employ the same number of pseudo-adder levels, the two multipliers have the same delay. The Wallace tree multiplier uses a smaller carry-propagate adder than the Dadda multiplier.

Both Wallace and Dadda multipliers are unsigned multipliers, while the Baugh-Wooley multiplier is a signed multiplier. Baugh and Wooley presented the modifications required for column compression multipliers to handle signed operands.

### Wallace tree multiplier:

A Wallace tree is used to reduce the number of partial products to be added down to two final intermediate results. Its main application is the multiplication of unsigned integers. The Wallace tree multiplier is an efficient hardware implementation of a digital circuit that multiplies two integers, devised by the Australian computer scientist Chris Wallace in 1964.

### Dadda multiplier:

Dadda multipliers are derived from Wallace parallel multipliers. In the first stage of a Dadda multiplier, the partial products are formed using $N^2$ AND gates. In the second stage, the partial product matrix is reduced to a height of two, using the minimum number of (3,2) and (2,2) counters at each level of compression. To generate a 16-bit product, 64 AND gates, 35 (3,2) counters, 7 (2,2) counters and a 14-bit carry-propagate adder are required. An 8\*8 Dadda multiplier requires four reduction levels, with matrix heights of 6, 4, 3 and 2.
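The height sequence quoted above can be reproduced with a few lines of Python. This is a minimal sketch that assumes the usual Dadda height rule ($d_1 = 2$, $d_{j+1} = \lfloor 1.5\, d_j \rfloor$), which the text does not state explicitly; the function names are illustrative only.

```python
def dadda_heights(n_bits):
    """Target matrix heights for an n_bits x n_bits Dadda reduction.

    Assumes the standard rule d_1 = 2, d_{j+1} = floor(1.5 * d_j).
    """
    seq = [2]
    while seq[-1] * 3 // 2 < n_bits:
        seq.append(seq[-1] * 3 // 2)
    # Reduction goes from the largest target below n_bits down to 2.
    return seq[::-1]


def and_gate_count(n_bits):
    """Number of AND gates that form the partial product matrix (N^2)."""
    return n_bits ** 2


print(dadda_heights(8))    # [6, 4, 3, 2]
print(and_gate_count(8))   # 64
```

For an 8-bit operand width this yields four reduction levels and 64 AND gates, which agrees with the figures given in the paragraph.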

### **ADDERS:**

### **Ripple Carry Adder:**

The full adder is the basic unit of the ripple carry adder (RCA). An RCA is constructed by connecting full adders in cascade: the Cout of each 1-bit full adder is given as Cin to the next 1-bit full adder, so the carry ripples through the circuit. The RCA occupies little on-chip area and performs well for random input data. The RCA delay depends on the carry propagation path, and for this reason the RCA is not preferred for circuits with non-random input operands. In an RCA, the output of a stage is valid only after the carry of the previous stage has been produced. In the worst case, the carry signal ripples through the adder from the LSB stage to the MSB stage, and only then is the MSB available.

The RCA delay is defined as follows:

$$t_{RCA} = n\,t_{FA}$$

Here $n$ is the number of 1-bit full adders cascaded in the RCA and $t_{FA}$ is the delay of a single 1-bit full adder. The critical delay is directly proportional to the number of bits $n$.
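As an illustration of the ripple behaviour and of the delay relation $t_{RCA} = n\,t_{FA}$, the following Python sketch models an RCA at bit level. It is a behavioural model only, not the authors' hardware description, and the function names are chosen here for illustration.

```python
def full_adder(a, b, cin):
    """1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, n, cin=0):
    """Add two n-bit numbers by cascading n full adders."""
    carry = cin
    total = 0
    for i in range(n):
        # Each stage must wait for the carry of the previous stage.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


def rca_delay(n, t_fa):
    """Worst-case delay: the carry ripples through all n full adders."""
    return n * t_fa


print(ripple_carry_add(0xFFFF, 1, 16))  # (0, 1): carry ripples LSB to MSB
```

The call with `0xFFFF + 1` is the worst case described above: the carry generated at the LSB passes through every stage before the final carry-out is known.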

### **Carry Look-Ahead Adder:**

Carry propagation delay is the major problem of the RCA, and it grows with the number of bits. In an RCA, the carry has to pass through all the lower bits before the sum of the higher bits can be generated. For fast applications a better design is therefore preferred, and the carry look-ahead adder (CLA) is one such design. It addresses the delay problem that is the main drawback of the RCA. In the CLA, the carry signals are calculated in advance from the input signals using generate and propagate logic. The CLA therefore offers less delay than the RCA, but requires more complex hardware and a larger on-chip area. In the CLA, the carry delay and sum delay are independent of the number of bits to be added, which is the main advantage of the carry look-ahead adder. The drawback of the CLA is that the carry logic becomes complex for more than 4 bits.
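The generate and propagate logic mentioned above can be sketched in Python as follows. The sketch assumes the usual definitions $g_i = a_i b_i$ and $p_i = a_i \oplus b_i$, which the text does not spell out, and evaluates each carry from a flattened expression in $g$, $p$ and the input carry only, which is what lets the hardware compute all carries in parallel.

```python
def cla_add(a, b, n=4, cin=0):
    """Behavioural model of an n-bit carry look-ahead adder."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]  # generate
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]  # propagate

    carries = [cin]
    for i in range(n):
        # c[i+1] = g[i] | p[i]g[i-1] | ... | p[i]...p[0]cin
        # No term depends on another carry, only on g, p and cin.
        term = cin
        for k in range(i + 1):
            term &= p[k]
        c = term
        for j in range(i + 1):
            t = g[j]
            for k in range(j + 1, i + 1):
                t &= p[k]
            c |= t
        carries.append(c)

    total = 0
    for i in range(n):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[n]


print(cla_add(0b1111, 0b0001))  # (0, 1)
```

The number of product terms per carry grows with the bit position, which illustrates why the carry logic becomes complex beyond about 4 bits.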

### **Carry Select Adder:**

In an RCA, every full adder in the cascade has to wait for its Cin signal before it can generate Cout, which is the main reason for the delay. To avoid this problem, both possible values of Cin, i.e. 0 and 1, are assumed and the results for both possibilities are evaluated in advance. Once the correct value of Cin is known, the correct result is chosen by means of a 2:1 MUX. This is the main idea behind the carry select adder (CSLA). The CSLA thus computes two results in parallel, one for each value of the carry-in. The CSLA is a simple and fast adder. The addition of two n-bit numbers (a, b) is performed by partitioning the inputs into two blocks. For every CSLA block, sum and carry values are generated for both Cin = 0 and Cin = 1. The actual carry is sent to the MUX to pick the actual sum and the Cout for the next block. With this approach the total delay of the CSLA is reduced.
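The select mechanism can be sketched in Python as below. This is a behavioural model of a conventional two-block CSLA under the assumption of an 8-bit adder split into two 4-bit halves; the split point and the function names are illustrative and not taken from the paper.

```python
def rca(a, b, n, cin):
    """n-bit ripple carry adder: returns (sum, carry_out)."""
    carry, total = cin, 0
    for i in range(n):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def carry_select_add(a, b, n=8, low_bits=4, cin=0):
    """Conventional CSLA with one low block and one high block."""
    mask_lo = (1 << low_bits) - 1
    hi_bits = n - low_bits

    # Low block: ordinary ripple addition with the real carry-in.
    lo_sum, lo_carry = rca(a & mask_lo, b & mask_lo, low_bits, cin)

    # High block: both results are precomputed in parallel.
    a_hi, b_hi = a >> low_bits, b >> low_bits
    sum0, cout0 = rca(a_hi, b_hi, hi_bits, 0)  # assumes Cin = 0
    sum1, cout1 = rca(a_hi, b_hi, hi_bits, 1)  # assumes Cin = 1

    # 2:1 MUX: the actual carry of the low block selects the result.
    hi_sum, cout = (sum1, cout1) if lo_carry else (sum0, cout0)
    return (hi_sum << low_bits) | lo_sum, cout


print(carry_select_add(200, 100))  # (44, 1), i.e. 300 = 256 + 44
```

The two calls for the high block are what cost the extra area in the conventional design, since the same block is effectively built twice.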

### **Carry Save Adder:**

In a carry save adder (CSA), carries are saved as partial carries instead of being propagated. During the next addition, these partial carries are added together with the next operand. By postponing carry propagation, each addition is accelerated. A carry save adder involves two steps: multiple-operand addition and a final carry-propagate addition. In each stage, the CSA adds the partial sum and partial carry from the previous stage to a new operand and produces a new partial sum and partial carry.
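A short Python sketch of the two steps follows. It models each CSA stage as a bitwise 3-to-2 compression and performs a single carry-propagate addition at the end; the function names are illustrative.

```python
def carry_save_step(x, y, z):
    """One CSA stage: compress three operands into (partial_sum, partial_carry).

    No carry is propagated between bit positions; each bit is handled
    independently and the carries are shifted one position to the left.
    """
    partial_sum = x ^ y ^ z
    partial_carry = ((x & y) | (y & z) | (x & z)) << 1
    return partial_sum, partial_carry


def multi_operand_add(operands):
    """Accumulate operands in carry-save form, then propagate carries once."""
    s, c = 0, 0
    for op in operands:
        s, c = carry_save_step(s, c, op)
    return s + c  # the only carry-propagate addition


print(multi_operand_add([3, 5, 7, 9]))  # 24
```

Each stage preserves the invariant `x + y + z == partial_sum + partial_carry`, so the final addition yields the correct total.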

### **Carry Skip Adder:**

The carry skip adder is also known as the carry bypass adder. It is an implementation that improves the carry delay somewhat with little extra effort. Several carry skip adders can be combined into a block carry skip adder to improve the worst-case delay.
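The bypass idea can be sketched in Python as follows. The sketch assumes the common formulation in which a block forwards its carry-in directly to its carry-out whenever every bit position propagates; the 4-bit block size and the names are illustrative assumptions.

```python
def carry_skip_block(a, b, n, cin):
    """One n-bit carry skip block: returns (sum, carry_out)."""
    p_all, carry, total = 1, cin, 0
    for i in range(n):
        x, y = (a >> i) & 1, (b >> i) & 1
        p = x ^ y
        p_all &= p                      # block propagate = AND of all p_i
        total |= (p ^ carry) << i
        carry = (x & y) | (p & carry)   # ordinary ripple carry
    # Skip MUX: if every bit propagates, Cin bypasses the ripple chain.
    cout = cin if p_all else carry
    return total, cout


def block_carry_skip_add(a, b, n_blocks=4, block_bits=4, cin=0):
    """Chain several skip blocks into a wider adder."""
    total, carry = 0, cin
    mask = (1 << block_bits) - 1
    for k in range(n_blocks):
        shift = k * block_bits
        s, carry = carry_skip_block((a >> shift) & mask, (b >> shift) & mask,
                                    block_bits, carry)
        total |= s << shift
    return total, carry
```

In this behavioural model the bypass does not change the result, only the path the carry takes; in hardware that shorter path is what reduces the worst-case delay.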

### III. PROPOSED FIR FILTER DESIGN:

In recent times, many finite impulse response (FIR) filter designs aimed at low area cost, high speed or reduced power consumption have been developed. It can be observed that the hardware cost of these FIR filters grows with the increase in area. This observation motivates the design of a low area-cost FIR filter with the benefits of reduced power consumption and moderate speed performance. To reduce the hardware cost, the hardware area must be optimized. Multipliers consume the largest amount of area in an FIR filter design. The product of two numbers has twice the bit width of the multiplied numbers, so the product bits can be truncated to the required precision to reduce the area cost. Conventional multipliers are replaced here by multipliers based on the modified Booth, Wallace tree and Dadda algorithms. The modified Booth algorithm produces only half the number of partial products (PPs) compared with ordinary binary multiplication.

### **Design of FIR Filter:**

The FIR filter is expressed as

$$y(n) = \sum_{k=0}^{N} X(n-k)\, H(k)$$

where $y(n)$ is the output signal of the FIR filter, $X(n)$ is the input signal and $H(k)$ is the set of filter coefficients. With the sum running from $k = 0$ to $N$, the filter has $N + 1$ taps, so $N = 7$ for the 8-tap design.
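The filter equation can be illustrated with a direct-form Python sketch. The coefficient values below are hypothetical and chosen only to exercise an 8-tap filter; input samples before $n = 0$ are assumed to be zero.

```python
def fir_filter(x, h):
    """Direct-form FIR: y(n) = sum over k of X(n-k) * H(k).

    Samples X(n) with n < 0 are taken as zero (an assumption of this sketch).
    """
    y = []
    for n in range(len(x)):
        acc = 0
        for k, coeff in enumerate(h):
            if n - k >= 0:
                acc += x[n - k] * coeff  # one multiply and one add per tap
        y.append(acc)
    return y


h = [1, 2, 3, 4, 4, 3, 2, 1]   # hypothetical 8-tap coefficients H(0)..H(7)
impulse = [1] + [0] * 9
print(fir_filter(impulse, h))  # [1, 2, 3, 4, 4, 3, 2, 1, 0, 0]
```

Feeding a unit impulse returns the coefficients themselves, which is a quick sanity check. Each output sample costs one multiplication and one addition per tap, which is why the multiplier and adder dominate the cost of the filter.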

The implementation of FIR filters needs multipliers, adders and signal delay elements. The multiplication is implemented using the Wallace tree and Dadda algorithms. The adder circuit is designed using area- and delay-efficient carry select adder circuits. The basic structure is shown in Fig. 1.



**Fig. 1: FIR Filter Design**

In this paper, the proposed FIR filter is designed using a carry select adder and a Wallace tree multiplier.



**Fig. 2: Flow of FIR Design**

### **Proposed Carry Select Adder:**

The design of area- and power-efficient high-speed data path logic systems is one of the most substantial areas of research in VLSI system design. In digital adders, the speed of addition is limited by the time required to propagate a carry through the adder. The sum for each bit position in an elementary adder is generated sequentially, only after the previous bit position has been summed and a carry propagated into the next position. The CSLA is used in many computational systems to alleviate the problem of carry propagation delay by independently generating multiple carries and then selecting a carry to generate the sum. However, the CSLA is not area efficient, because it uses multiple pairs of ripple carry adders (RCA) to generate the partial sum and carry for carry input Cin = 0 and Cin = 1, after which the final sum and carry are selected by multiplexers (mux). The basic idea of this work is to use a Binary to Excess-1 Converter (BEC) instead of the RCA with Cin = 1 in the regular CSLA to achieve lower area and power consumption.

The main advantage of this BEC logic comes from its smaller number of logic gates compared with the n-bit full adder (FA) structure. The SQRT CSLA has been chosen for comparison with the proposed design as it has a more balanced delay and requires lower power and area. The delay and area evaluation methodology of the regular and modified SQRT CSLA is presented.

The structure of the new CSLA is shown in Fig. 3(a). It consists of one HSG unit, one FSG unit, one CG unit, and one CS unit. The CG unit is composed of two CGs (CG<sub>0</sub> and CG<sub>1</sub>) corresponding to input-carry '0' and '1'. The HSG receives two n-bit operands (A and B) and generates the half-sum word $s_0$ and the half-carry word $c_0$, each of width n bits. Both CG<sub>0</sub> and CG<sub>1</sub> receive $s_0$ and $c_0$ from the HSG unit and generate two n-bit full-carry words $c_1^0$ and $c_1^1$ corresponding to input-carry '0' and '1', respectively. The logic diagram of the HSG unit is shown in Fig. 3(b). The logic circuits of CG<sub>0</sub> and CG<sub>1</sub> are optimized to take advantage of the fixed input-carry bits.






**Fig. 3: (a) Proposed CS adder design. (b) Gate-level design of the HSG. (c) Gate-level optimized design of CG<sub>0</sub> for input-carry = 0. (d) Gate-level optimized design of CG<sub>1</sub> for input-carry = 1. (e) Gate-level design of the CS unit. (f) Gate-level design of the final-sum generation (FSG) unit.**
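The data flow through the HSG, CG, CS and FSG units can be sketched at bit level in Python. The paper names the units but does not list their Boolean equations, so the equations below are an assumption based on the standard half-sum and half-carry formulation (half-sum = A XOR B, half-carry = A AND B, carry recurrence c(i) = c0(i) OR (s0(i) AND c(i-1))). The function is a behavioural model only.

```python
def proposed_csla(a, b, n, cin):
    """Behavioural sketch of the HSG -> CG0/CG1 -> CS -> FSG data flow."""
    A = [(a >> i) & 1 for i in range(n)]
    B = [(b >> i) & 1 for i in range(n)]

    # HSG: half-sum word s0 and half-carry word c0 (assumed equations).
    s0 = [x ^ y for x, y in zip(A, B)]
    c0 = [x & y for x, y in zip(A, B)]

    def carry_gen(c_init):
        """CG unit for a fixed input-carry (0 for CG0, 1 for CG1)."""
        carries, prev = [], c_init
        for i in range(n):
            prev = c0[i] | (s0[i] & prev)
            carries.append(prev)
        return carries

    c1_0 = carry_gen(0)  # full-carry word for input-carry 0
    c1_1 = carry_gen(1)  # full-carry word for input-carry 1

    # CS unit: carry selection happens BEFORE the final sum is formed.
    c = c1_1 if cin else c1_0
    cout = c[n - 1]

    # FSG: final sum from the half-sum word and the selected carries.
    s = [s0[0] ^ cin] + [s0[i] ^ c[i - 1] for i in range(1, n)]
    total = sum(bit << i for i, bit in enumerate(s))
    return total, cout


print(proposed_csla(0b1111, 0b0000, 4, 1))  # (0, 1)
```

Because the selection works on carry words rather than on complete sums, only one final-sum stage is needed, and the output carry is available as soon as the CS unit has selected it.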

The multipath carry propagation feature of the CSLA is fully exploited in the SQRT-CSLA, which is composed of a chain of CSLAs. CSLAs of increasing size are used in the SQRT-CSLA to extract the maximum concurrency in the carry propagation path. Using the SQRT-CSLA design, large-size adders are implemented with significantly less delay than a single-stage CSLA of the same size. However, the carry propagation delay between the CSLA stages of the SQRT-CSLA is critical for the overall adder delay. Because of the early generation of the output-carry with the multipath carry propagation feature, the proposed CSLA design is more favorable than the existing CSLA designs for area-delay efficient implementation of the SQRT-CSLA. A 16-bit SQRT-CSLA design using the proposed CSLA is shown in Fig. 4, where a 2-bit RCA, 2-bit CSLA, 3-bit CSLA, 4-bit CSLA, and 5-bit CSLA are used. To optimize adder delay and to demonstrate the advantage of the proposed CSLA design in the SQRT-CSLA, we have considered the cascaded configurations of (2-bit RCA and 2-, 3-, 4-, 6-, 7-, and 8-bit CSLAs) and (2-bit RCA and 2-, 3-, 4-, 6-, 7-, 8-, 9-, 11-, and 12-bit CSLAs) for the 32-bit SQRT-CSLA and the 64-bit SQRT-CSLA, respectively.



**Fig. 4: Proposed 16-bit SQRT-CSLA**
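The block partitioning of the 16-bit SQRT-CSLA (widths 2, 2, 3, 4 and 5) can be sketched in Python as below. This is a behavioural model in which every block is represented by a generic select block; in the design described above the first 2-bit block is a plain RCA, which is functionally equivalent here. The names are illustrative.

```python
def csla_block(a, b, width, cin):
    """One carry select block: precompute both results, then select."""
    def ripple(c):
        total = 0
        for i in range(width):
            x, y = (a >> i) & 1, (b >> i) & 1
            total |= (x ^ y ^ c) << i
            c = (x & y) | (c & (x ^ y))
        return total, c

    result0, result1 = ripple(0), ripple(1)
    return result1 if cin else result0


def sqrt_csla_16(a, b, cin=0, widths=(2, 2, 3, 4, 5)):
    """16-bit SQRT-CSLA built from blocks of increasing width."""
    assert sum(widths) == 16
    total, carry, offset = 0, cin, 0
    for w in widths:
        mask = (1 << w) - 1
        # The carry of the previous block selects the result of this block.
        s, carry = csla_block((a >> offset) & mask, (b >> offset) & mask,
                              w, carry)
        total |= s << offset
        offset += w
    return total, carry


print(sqrt_csla_16(40000, 30000))  # (4464, 1), i.e. 70000 = 65536 + 4464
```

The increasing block widths give the later, wider blocks more time to finish their two candidate results while the selecting carry travels through the earlier blocks.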

### Wallace Tree Multiplier:

In Wallace tree multiplication, a 4-bit multiplicand and a 4-bit multiplier give 16 partial products. The partial products are formed using AND gates. A parallel (n,m) counter is a circuit which has n inputs and produces m outputs. A full adder is an implementation of a (3,2) counter, which takes 3 inputs and produces 2 outputs. Similarly, a half adder is an implementation of a (2,2) counter, which takes 2 inputs and produces 2 outputs. In total, there are three stages of partial product reduction for 4-bit multiplication. If there are p rows of partial products, the rows are grouped in threes and the remaining p mod 3 rows are passed to the next stage. Each group is summed using full adders [(3,2) counters] where there are three partial products in a column and half adders [(2,2) counters] where there are two partial products in a column. The resulting sum and carry bits of the full adders and half adders are passed on to the next stage. In the final stage, full adders are used to obtain the product. The height of the matrix in the jth reduction stage is given by the following recursive equations.

$$w_0 = N$$
$$w_{j+1} = 2 \cdot \left\lfloor \frac{w_j}{3} \right\rfloor + (w_j \bmod 3)$$






**Fig. 5: Wallace Tree Multiplier**

The principle of Wallace tree multiplication can be extended to longer wordlengths. For an 8-bit multiplier ($w_0 = 8$), four reduction stages are required, with matrix heights of 6, 4, 3 and 2.
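The recursive height equations can be evaluated directly. The following Python sketch applies $w_{j+1} = 2 \lfloor w_j / 3 \rfloor + (w_j \bmod 3)$ starting from $w_0 = N$ until the matrix height reaches two; the function name is illustrative.

```python
def wallace_heights(n):
    """Matrix heights w_0, w_1, ... of a Wallace reduction, ending at 2."""
    heights = [n]
    while heights[-1] > 2:
        w = heights[-1]
        # Each full group of three rows becomes two rows (sum and carry);
        # the leftover w mod 3 rows pass through unchanged.
        heights.append(2 * (w // 3) + w % 3)
    return heights


print(wallace_heights(8))  # [8, 6, 4, 3, 2]
print(wallace_heights(4))  # [4, 3, 2]
```

Starting from a height of 8, the recurrence gives the sequence 6, 4, 3, 2, which corresponds to the four reduction stages stated for the 8-bit case.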

### An 8X8 Wallace Tree Multiplier:

The multiplier is designed in Verilog. It accepts two 8-bit numbers, the multiplicand and the multiplier, and produces a 16-bit product. The design is optimised for speed. A Wallace tree multiplier is made up mainly of two components, namely the half adder and the full adder, so a half adder and a full adder are designed first. Building an 8X8 multiplier requires 8 half adders and 48 full adders, i.e. a total of 56 adders. The half adder and the full adder are therefore instantiated for each computation as required, with the appropriate signals passed to them. The product is obtained from the sum and carry bits of the adders. In this way, both the efficient carry select adder and the Wallace tree multiplier are employed in the design of the FIR filter. Since an 8-tap FIR filter is designed here, a 16-bit carry select adder and an 8-bit Wallace tree multiplier are employed.
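The structure of such a multiplier can be sketched at bit level in Python. This is not the authors' Verilog design: it is a simplified column-wise reduction that applies full adders to groups of three bits and half adders to leftover pairs, so the adder counts it reports are not guaranteed to match the 8 half adders and 48 full adders quoted above. The final summation stands in for the carry-propagate adder.

```python
def wallace_multiply(a, b, n=8):
    """Multiply two n-bit unsigned numbers by column compression.

    Returns (product, full_adders_used, half_adders_used).
    """
    # Partial products: a_i AND b_j has weight 2^(i+j).
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))

    full_adders = half_adders = 0
    while max(len(c) for c in cols) > 2:
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, col in enumerate(cols):
            k = 0
            while len(col) - k >= 3:           # (3,2) counter: full adder
                x, y, z = col[k:k + 3]
                nxt[w].append(x ^ y ^ z)
                nxt[w + 1].append((x & y) | (y & z) | (x & z))
                full_adders += 1
                k += 3
            if len(col) - k == 2:              # (2,2) counter: half adder
                x, y = col[k:k + 2]
                nxt[w].append(x ^ y)
                nxt[w + 1].append(x & y)
                half_adders += 1
            elif len(col) - k == 1:            # single bit passes through
                nxt[w].append(col[k])
        cols = nxt

    # Final stage: add the (at most two) remaining rows.
    product = sum(bit << w for w, col in enumerate(cols) for bit in col)
    return product, full_adders, half_adders


product, fa, ha = wallace_multiply(255, 255)
print(product)  # 65025
```

Every full adder and half adder preserves the weighted sum of the bits it consumes, so the product is correct regardless of how the bits are grouped; only the adder count and the depth depend on the grouping.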

### IV. SIMULATION RESULTS:

### RTL Schematics:

The RTL (register transfer level) schematic can be viewed as a black box after the design has been synthesized. It shows the inputs and outputs of the system. Double-clicking on the diagram reveals the gates, flip-flops and MUXes inside.





**Fig. 6: RTL Schematic of 8-Tap FIR Filter**






**Fig. 7: Test Bench for 8-Tap FIR Filter**

### WAVEFORM:



### V. CONCLUSION AND FUTURE WORK:

In this paper, an efficient FIR filter is implemented using an efficient carry select adder and a Wallace tree multiplier. The Wallace tree multiplier delivers better performance than the Dadda multiplier for the FIR filter architectures, and it has the smaller critical path delay of the two. The input samples and filter coefficients are applied as inputs to the multipliers and adders. The implementation results of the proposed and conventional architectures are based on Xilinx tools. Synthesis and simulation are done in Verilog HDL using the Xilinx ISE 14.4 software tool. ModelSim is used for simulation, and Xilinx is used for evaluating the response of the existing and proposed architectures. The proposed work can be extended to adaptive filters.

### **REFERENCES:**

[1] Hunsoo Choo, Khurram Muhammad, and Kaushik Roy, "Two's Complement Computation Sharing Multiplier and Its Applications to High Performance DFE," IEEE Transactions On Signal Processing, Vol. 51, No. 2, Feb. 2003.

[2] Richard Hartley, "Optimization Of Canonic Signed Digit Multipliers For Filter Design," IEEE International Symposium on Circuits and Systems, Vol. 4, Jun 1991.

[3] C.R.Baugh and B.A. Wooley, "A two's complement parallel array multiplication algorithm," IEEE Trans. On Computers, vol.22, pp. 1045-1047, 1973.

[4] K.A.C.Bickerstaff, E.E.Swartzlander Jr., and M.J.Schulte, "Analysis of column compression multipliers," 15th IEEE Symp. On Computer Architecture, pp.33-39, 2001.

[5] C.S.Wallace, "A suggestion for a fast multiplier," IEEE Trans. On Computers, vol. 13, pp. 14-17, 1964.

[6] E.E. Swartzlander Jr., "Merged arithmetic," vol.29, pp. 946-950, 1980.

[7] S. White, "Applications of Distributed Arithmetic to Digital Signal Processing: A Tutorial Review," IEEE ASSP Magazine, July 1989, pp. 4–19.

[8] Keshab K. Parhi, "VLSI Digital Signal Processing," Wiley.

[9] K. K. Parhi, VLSI Digital Signal Processing. New York, NY, USA:Wiley, 1998.

[10] A. P. Chandrakasan, N. Verma, and D. C. Daly, "Ultralow-power electronics for biomedical applications," Annu. Rev. Biomed. Eng., vol. 10, pp. 247–274, Aug. 2008.

[11] O. J. Bedrij, "Carry-select adder," IRE Trans. Electron. Comput., vol. EC-11, no. 3, pp. 340–344, Jun. 1962.

[12] Y. Kim and L.-S. Kim, "64-bit carry-select adder with reduced area," Electron. Lett., vol. 37, no. 10, pp. 614–615, May 2001.




[13] Y. He, C. H. Chang, and J. Gu, "An area-efficient 64-bit square root carry select adder for low power application," in Proc. IEEE Int. Symp. Circuits Syst., 2005, vol. 4, pp. 4082–4085.

[14] B. Ramkumar and H.M. Kittur, "Low-power and area-efficient carry-select adder," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 2, pp. 371–375, Feb. 2012.

[15] I.-C. Wey, C.-C. Ho, Y.-S. Lin, and C. C. Peng, "An area-efficient carry select adder design by sharing the common Boolean logic term," in Proc. IMECS, 2012, pp. 1–4.

[16] S.Manju and V. Sornagopal, "An efficient SQRT architecture of carry select adder design by common Boolean logic," in Proc. VLSI ICEVENT, 2013, pp. 1–5.

[17] B. Parhami, Computer Arithmetic: Algorithms and Hardware Designs, 2nd ed. New York, NY, USA: Oxford Univ. Press, 2010.

Volume No: 3 (2016), Issue No: 10 (October) www.ijmetmr.com

October 2016