

A Monthly Peer Reviewed Open Access International e-Journal

# A Novel architecture of 64 bit MAC unit using Dadda multiplier

T. Murali Krishna M.Tech Student, Rise Gandhi Group Of Institutions. S.K. Rasool Asst Professor, Rise Gandhi Group Of Institutions.

# **ABSTRACT:**

The multiply-accumulate (MAC) unit is an essential component of many digital signal processing (DSP) applications involving multiplications and/or accumulations, and it is used in high-performance DSP systems. Typical DSP applications include filtering, convolution, and inner products. Many DSP methods also rely on transforms such as the discrete cosine transform (DCT) or the discrete wavelet transform (DWT). Because these are basically accomplished by repetitive application of multiplication and addition, the speed of the multiplication and addition arithmetic determines the execution speed and performance of the entire calculation [1].

Multiply-and-accumulate (MAC) operations are typical for digital filters. Therefore, the functionality of the MAC unit enables high-speed filtering and other processing typical for DSP applications. Since the MAC unit operates completely independently of the CPU, it can process data separately and thereby reduce the CPU load. Applications such as DSP-based optical communication systems require extremely fast processing of huge amounts of digital data. The Fast Fourier Transform (FFT) also requires addition and multiplication. A 64-bit design can handle larger operands and address more memory.

#### **1.INTRODUCTION:**

With the recent rapid advances in multimedia and communication systems, real-time signal processing such as audio signal processing, video/image processing, and large-capacity data processing is increasingly in demand. The multiplier and the multiplier-and-accumulator (MAC) are essential elements of digital signal processing operations such as filtering, convolution, transformations, and inner products. There are different entities that one would like to optimize when designing a VLSI circuit.

Volume No: 1(2014), Issue No: 11 (November) www.ijmetmr.com

These entities often cannot be optimized simultaneously; one can only be improved at the expense of one or more others. Designing an integrated circuit that is efficient in power, area, and speed simultaneously has become a very challenging problem. Power dissipation is recognized as a critical parameter in modern VLSI design. The objective of a good multiplier is to provide a physically compact, high-speed, and low-power chip. This paper proposes a new multiplier-and-accumulator (MAC) architecture for high speed and low power by adopting a new Spurious Power Suppression Technique (SPST) implementation approach. The multiplier is designed by applying the SPST to a modified Booth encoder, which is controlled by a detection unit using an AND gate. The modified Booth encoder reduces the number of partial products generated by a factor of 2. The SPST adder avoids unwanted additions and thus minimizes the switching power dissipation. By combining multiplication with accumulation and devising a low-power carry save adder (CSA), the performance is improved.
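The halving of the partial products by the modified Booth encoder can be illustrated with a small behavioural model. The following Python sketch is not the paper's hardware description; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the model assumes a two's-complement multiplier with an even bit width.

```python
def booth_radix4_digits(multiplier: int, width: int) -> list[int]:
    """Recode a `width`-bit two's-complement multiplier into radix-4
    Booth digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    if width % 2:
        raise ValueError("width must be even for radix-4 recoding")
    bits = multiplier & ((1 << width) - 1)
    digits = []
    prev = 0  # implicit 0 to the right of the least significant bit
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        # Each overlapping triplet (b1, b0, prev) maps to one digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum one shifted partial product per Booth digit (width/2 in total)."""
    digits = booth_radix4_digits(multiplier, width)
    return sum((d * multiplicand) << (2 * k) for k, d in enumerate(digits))
```

For a 64-bit multiplier this recoding yields 32 digits, so 32 partial products are summed instead of 64.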







# **I. CONVENTIONAL MODIFIED:**

The Wallace tree multiplier is a three-step process. In the first step, the bit product terms are formed by multiplying the bits of the multiplicand and the multiplier. In the second step, the bit product matrix is reduced to a smaller number of rows using half and full adders; this continues until only two rows remain for the last addition. In the final step, that addition is done using adders to obtain the result. The benefit of the Wallace tree is that there are only O(log n) reduction layers, and each layer has O(1) propagation delay. As making the partial products is O(1) and the final addition is O(log n), the multiplication is only O(log n), not much slower than addition (however, much more expensive in gate count). Naively adding the partial products with regular adders would require O(log² n) time. From a complexity-theoretic perspective, the Wallace tree algorithm puts multiplication in the class NC¹. These computations consider only gate delays and do not deal with wire delays, which can also be very substantial.
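The logarithmic number of reduction layers can be checked with a simplified software model. The sketch below reduces whole partial-product rows with 3:2 carry-save compressors rather than individual bit columns, so it is a word-level approximation of the Wallace scheme and not a gate-level design; the names `csa` and `wallace_multiply` are hypothetical, and the operands are assumed to be unsigned.

```python
def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compressor applied to whole words: a + b + c == s + carry."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s, carry


def wallace_multiply(x: int, y: int, width: int) -> tuple[int, int]:
    """Unsigned multiply; returns (product, number of reduction layers)."""
    # Step 1: one shifted partial product row per multiplier bit.
    rows = [(x << i) if (y >> i) & 1 else 0 for i in range(width)]
    layers = 0
    # Step 2: compress groups of three rows into two until two rows remain.
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3
        reduced = []
        for i in range(0, full, 3):
            reduced.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        reduced.extend(rows[full:])  # leftover rows pass through unchanged
        rows = reduced
        layers += 1
    # Step 3: the final carry-propagate addition.
    return sum(rows), layers
```

With this grouping, 4 rows need 2 layers and 64 rows need 10 layers, which illustrates the slow growth of the layer count with operand width.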

# Architecture of the Wallace Tree Multiplier:



# **MODIFIED 4\*4 WALLACE TREE MULTIPLIER:**

The basic multiplication principle is twofold: evaluation of the partial products and accumulation of the shifted partial products. The accumulation is performed by successive additions of the columns of the shifted partial product matrix. The 'multiplier' is successively shifted and gates the appropriate bit of the 'multiplicand'.

The delayed, gated instances of the multiplicand must all be in the same column of the shifted partial product matrix. They are then added to form the product bit for that particular column. Multiplication is therefore a multi-operand operation. The multiplication is to be extended to both signed and unsigned operands.
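The gating and column-wise accumulation described above can be written out for unsigned operands. This is a minimal sketch for illustration only; `partial_product_matrix` and `add_columns` are hypothetical names, and each row is stored as a list of bits with the least significant bit first.

```python
def partial_product_matrix(multiplicand: int, multiplier: int, width: int) -> list[list[int]]:
    """Row i is the multiplicand ANDed with multiplier bit i, shifted left by i."""
    rows = []
    for i in range(width):
        gate = (multiplier >> i) & 1
        row = [0] * (2 * width)
        for j in range(width):
            row[i + j] = ((multiplicand >> j) & 1) & gate
        rows.append(row)
    return rows


def add_columns(rows: list[list[int]]) -> int:
    """Add the matrix column by column, carrying into the next column."""
    product, carry = 0, 0
    for col in range(len(rows[0])):
        total = carry + sum(row[col] for row in rows)
        product |= (total & 1) << col  # product bit for this column
        carry = total >> 1             # the remainder moves one column left
    return product
```

For the 4\*4 case, `add_columns(partial_product_matrix(13, 11, 4))` adds four gated rows over eight columns and returns 143.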

#### 4. Parallel prefix adder:

The binary adder is the critical element in most digital circuit designs, including digital signal processors (DSPs) and microprocessor datapath units. As such, extensive research continues to be focused on improving the power-delay performance of the adder.

Parallel-prefix adders (also known as carry-tree adders) are known to have the best performance in VLSI designs. However, this performance advantage does not translate directly into FPGA implementations due to constraints on logic block configurations and routing overhead. This paper investigates three types of carry-tree adders (the Kogge-Stone, sparse Kogge-Stone, and spanning tree adder).
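As an illustration of the carry-tree idea, the following Python sketch models a Kogge-Stone adder with word-wide generate and propagate signals. It is a behavioural model under the assumption of unsigned operands and no carry-in, not a description of the adder used in this work; the function name `kogge_stone_add` is hypothetical.

```python
def kogge_stone_add(a: int, b: int, width: int) -> tuple[int, int]:
    """Return (sum modulo 2**width, carry out) using a Kogge-Stone prefix tree."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    g = a & b        # bit i generates a carry
    p = a ^ b        # bit i propagates an incoming carry
    half_sum = p     # sum bits before the carries are applied
    dist = 1
    while dist < width:
        # Combine each group with the group `dist` positions below it.
        g = g | (p & (g << dist))
        p = p & (p << dist)
        dist <<= 1
    # After about log2(width) levels, g[i] is the carry out of bit i,
    # so the carry into bit i is g[i - 1].
    total = (half_sum ^ (g << 1)) & mask
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out
```

A 64-bit addition passes through six prefix levels in this model (dist = 1, 2, 4, 8, 16, 32), compared with a 64-stage carry chain in a ripple-carry adder.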

# Fig. parallel prefix adder:



# **Proposed multiplier:**

The existing Modified Booth Encoding (MBE) multiplier and the Baugh-Wooley multiplier perform multiplication on signed numbers only. The array multiplier and the Braun array multiplier perform multiplication on unsigned numbers only. A modern computer system therefore requires a dedicated, very high-speed multiplier unit that handles both signed and unsigned numbers.

Therefore, this paper presents the design and implementation of the signed-unsigned Modified Booth Encoding (SUMBE) multiplier. The modified Booth encoder circuit generates half the partial products in parallel. The SUMBE multiplier is obtained by extending the sign bit of the operands and generating an additional partial product.




The Carry Save Adder (CSA) tree and the final Carry Lookahead Adder (CLA) are used to speed up the multiplier operation. Since signed and unsigned multiplication is performed by the same multiplier unit, the required hardware and chip area are reduced, which in turn reduces the power dissipation and cost of the system.
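One way to read the sign-extension idea is sketched below in Python. It is an interpretation under stated assumptions, not the authors' circuit: each operand is extended by two bits (copies of the sign bit in signed mode, zeros in unsigned mode) so that the same radix-4 Booth recoding serves both modes at the cost of one additional partial product. The names `extend_operand`, `booth_digits`, and `su_multiply` are hypothetical.

```python
def extend_operand(bits: int, width: int, signed: bool) -> int:
    """Value of a `width`-bit pattern: two's complement if signed, else unsigned."""
    bits &= (1 << width) - 1
    if signed and (bits >> (width - 1)) & 1:
        return bits - (1 << width)
    return bits


def booth_digits(value: int, width: int) -> list[int]:
    """Radix-4 Booth digits of `value` seen as a `width`-bit two's-complement word."""
    bits = value & ((1 << width) - 1)
    digits, prev = [], 0
    for i in range(0, width, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def su_multiply(a_bits: int, b_bits: int, width: int, signed: bool) -> int:
    """Multiply two `width`-bit patterns as signed or unsigned numbers."""
    a = extend_operand(a_bits, width, signed)
    b = extend_operand(b_bits, width, signed)
    # Two extension bits add exactly one Booth digit (one extra partial product).
    digits = booth_digits(b, width + 2)
    return sum((d * a) << (2 * k) for k, d in enumerate(digits))
```

With 8-bit patterns, `su_multiply(0xFF, 0xFF, 8, signed=False)` gives 65025 (255 × 255), while the same patterns with `signed=True` give 1 (−1 × −1).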

When a 64-bit input is given to the multiplier, it computes the product, so the output is 128 bits wide. The multiplier output is given as the input to the carry save adder, which performs the addition.
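The data flow from the multiplier into the carry save adder can be modelled as follows. This Python sketch assumes unsigned 64-bit operands and a 128-bit accumulator that wraps on overflow; neither assumption is stated in the text, and the class name `Mac64` is hypothetical. The accumulator is kept in carry-save form, so no carry propagates until the result is read.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 carry-save adder on 128-bit words (carry truncated to 128 bits)."""
    s = a ^ b ^ c
    carry = (((a & b) | (a & c) | (b & c)) << 1) & MASK128
    return s, carry


class Mac64:
    """Behavioural 64x64 MAC with the accumulator held as (sum, carry)."""

    def __init__(self) -> None:
        self.sum = 0
        self.carry = 0

    def step(self, x: int, y: int) -> None:
        product = (x & MASK64) * (y & MASK64)  # at most 128 bits wide
        # Fold the new product into the accumulator without carry propagation.
        self.sum, self.carry = csa(self.sum, self.carry, product)

    def result(self) -> int:
        # Single carry-propagate addition, done only when the value is needed.
        return (self.sum + self.carry) & MASK128
```

For example, after `step(3, 4)` and `step(5, 6)` on a fresh `Mac64`, `result()` returns 42.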

# **IV. SIMULATION RESULTS:**

**ACCUMULATOR:** 

The Multiplier-Accumulator (MAC) operation is a key operation of the signal processor. The input, which is 64 bits wide, is fed from a memory location.

Verilog code is written to generate the required hardware and to produce the partial products for the MAC.







Fig: Simulation results

### **V. CONCLUSION:**

In this paper, we present a 64-bit MAC that was implemented using a Booth multiplier. The adder circuit used is a parallel prefix adder. In the multiplier design, a carry save adder and a carry lookahead adder are used to add the partial products. The delay of the proposed system is much lower than that of the previous design.

#### **REFERENCES:**

[1] W.-C. Yeh and C.-W. Jen, "High Speed Booth Encoded Parallel Multiplier Design," IEEE Transactions on Computers, vol. 49, no. 7, pp. 692-701, July 2000.

[2] Shiann-Rong Kuang, Jiun-Ping Wang, and Cang-Yuan Guo, "Modified Booth Multipliers with a Regular Partial Product Array," IEEE Transactions on Circuits and Systems II, vol. 56, no. 5, May 2009.

[3] Li-Rong Wang, Shyh-Jye Jou, and Chung-Len Lee, "A Well-Structured Modified Booth Multiplier Design," 978-1-4244-1617-2/08/\$25.00 ©2008 IEEE.

[4] Soojin Kim and Kyeongsoon Cho, "Design of High-speed Modified Booth Multipliers Operating at GHz Ranges," World Academy of Science, Engineering and Technology 61, 2010.

[5] Magnus Sjalander and Per Larsson-Edefors, "The Case for HPM-Based Baugh-Wooley Multipliers," Chalmers University of Technology, Sweden, March 2008.

[6] J. Fadavi-Ardekani, "M×N Booth Encoded Multiplier Generator Using Optimized Wallace Trees," IEEE Trans. VLSI Systems, vol. 1, no. 2, June 1993.

[7] Wang, G., "A unified unsigned/signed binary multiplier," The Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004, vol. 1, pp. 513-516, Nov. 7-10, 2004.

[8] Kim J. Y., "Multiplier to selectively perform unsigned magnitude multiplication or signed magnitude multiplication," US patent 5,870,322, Feb. 9, 1999.

[9] Hwang-Cherng Chow and I-Chyn Wey, "A 3.3V 1GHz high speed pipelined Booth multiplier," Proc. of IEEE ISCAS, vol. 1, pp. 457-460, May 2002.

[10] M. Aguirre-Hernandez and M. Linares-Aranda, "Energy-efficient high-speed CMOS pipelined multiplier," Proc. of IEEE CCE, pp. 460-464, Nov. 2008.

[11] A. D. Booth, "A signed binary multiplication technique," Quarterly J. Mechanical and Applied Math, vol. 4, pp. 236-240, 1951.

[12] Kuang S. R., Wang J. P., Guo C. Y., "Modified Booth Multipliers With a Regular Partial Product Array," IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 56, issue 5, pp. 404-408, May 2009.