

A Peer Reviewed Open Access International Journal

# VLSI Architecture For Fused Add Then Multiply Functions

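Vakkala Prudhviraj Yadav, PG Scholar, Dept. of ECE, Global College of Engineering & Technology, Kadapa, YSR (Dt), AP, India.

A. Asha, Assistant Professor, Dept. of ECE, Global College of Engineering & Technology, Kadapa, YSR (Dt), AP, India.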

#### **ABSTRACT:**

Booth recoding is widely used to reduce the number of partial products in multipliers. The benefit is mainly an area reduction in multipliers with medium to large operand widths (8 or 16 bits and higher), owing to the much smaller adder tree, while delays remain roughly in the same range. Different recodings exist, resulting in different gate-level implementations and performance.

In this work, the XOR-based implementation gives the lowest area and delay in most technologies, owing to its small selector size and well-balanced signal paths. An implementation of a radix-4 butterfly has also been developed, in which the number of stages is reduced while the throughput remains comparable to that of radix-2.

Therefore, the radix-4 butterfly implementation is suitable for high-speed applications, since hardware cost, power consumption and latency are all reduced. To reduce the number of calculation steps for the partial products, the Modified Booth algorithm (MBA) is applied, while a Wallace tree speeds up the addition of the partial products.

## **Keywords:**

MB recoding, Add-Multiply operation, arithmetic circuits, CLA adder, VLSI design.

## **I.INTRODUCTION:**

The existing recoding schemes may provide efficient implementations, but their disadvantage is that they rely on complex bit-level manipulations, with the circuits implemented at gate level.

An efficient and structured Sum to Modified Booth (S-MB) recoder has been proposed for implementing the AM unit with a Radix-4 algorithm. As the radix increases, the number of partial products falls, and with it the hardware and delay. The main focus of this work is therefore the design and implementation of a Radix-8 Modified Booth recoder that yields better performance when used in an Add-Multiply (AM) unit.

Compared to the Radix-4 design, the modified Radix-8 MB recoder design is simple, structured, better in performance and easily extended to any higher radix. The proposed Fused Add-Multiply (FAM) unit can be used in signal processing applications such as the Fast Fourier Transform (FFT). Fig. 1 shows the conventional and modified designs of the AM unit.



Fig. 1: Add multiply unit (a) conventional design (b) modified design

## **II.RELATED WORK:**

A number of promising technologies have brought enormous advances in multiplier design over the past few decades. The array multiplier, one of the earliest reported multipliers, employs a series of ripple-carry adders to compute the product by repetitive addition. It has a regular structure, but its speed is relatively slow [7]. The Wallace tree multiplier addresses these shortcomings. The Wallace tree construction accelerates multiplication by compressing the partial products in a tree-like fashion into two rows, which are then added by a suitable adder in the last stage. In general, the Wallace tree multiplier reduces the time complexity and the depth of the adder chain. In high-speed multipliers, 4:2 compressors are used extensively to cut the time taken at the partial product accumulation stage. By virtue of their regular interconnection, 4:2 compressors are used to construct regularly structured Wallace tree multipliers with reduced complexity [8].

In the S-MB2 recoding mechanism, the sum of two consecutive bits of the two inputs $A$ $(a_{2j}, a_{2j+1})$ and $B$ $(b_{2j}, b_{2j+1})$ is recoded into a single MB digit $Y_j^{MB}$. In general, three bits are involved in forming an MB digit. The most significant of them has negative weight, while the two least significant bits are positively weighted, and signed-bit arithmetic is used to transform the above pairs of bits into MB form. Bit-level signed Half Adders (HA) and signed Full Adders (FA) are used for this purpose.

Two types of signed HAs, HA\* and HA\*\*, are used. The Boolean equations for HA\* are $c = p \lor q$ and $s = p \oplus q$, where $p$ and $q$ are the binary inputs and $c$ and $s$ are the carry and sum outputs respectively. Fig. 2(a) shows the schematic of HA\*\*. Two types of signed FAs, FA\* and FA\*\*, are used as building blocks in the S-MB recoders. Boolean equations and schematics for the signed FA\* and FA\*\* are given in Fig. 2(b) and Fig. 2(c) respectively. Here $p$, $q$ and $c_i$ are the inputs, and $c_o$ and $s$ are the output carry and sum respectively. FA\* implements the relation $2 c_o - s = p - q + c_i$, where the bits $s$ and $q$ are negatively signed. In FA\*\*, the two inputs $p$ and $q$ are negatively signed; the relation it implements is given with its schematic in Fig. 2(c).



Fig. 2: Schematic for signed (a) HA\*, (b) FA\* and (c) FA\*\*
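As an illustrative check of the signed adders described above, the following Python sketch enumerates all input combinations. The HA\* equations are those given in the text. The weighting $2c - s = p + q$ for HA\* and the gate-level form of the FA\* carry (a majority of $p$, $\bar{q}$ and $c_i$) are assumptions derived here from the stated relations, not taken from Fig. 2, and the function names are hypothetical.

```python
from itertools import product


def ha_star(p, q):
    """Signed half adder HA*: c = p OR q, s = p XOR q (equations from the text)."""
    return p | q, p ^ q  # (carry, sum)


def fa_star(p, q, ci):
    """Signed full adder FA* (assumed gate-level form, derived from the relation)."""
    s = p ^ q ^ ci                         # parity of the three inputs
    nq = 1 - q                             # q enters with negative weight
    co = (p & nq) | (p & ci) | (nq & ci)   # majority(p, NOT q, ci)
    return co, s


def check_signed_adders():
    """Exhaustively verify both relations; the sum bit s carries negative weight."""
    for p, q in product((0, 1), repeat=2):
        c, s = ha_star(p, q)
        assert 2 * c - s == p + q          # assumed weighting for HA*
    for p, q, ci in product((0, 1), repeat=3):
        co, s = fa_star(p, q, ci)
        assert 2 * co - s == p - q + ci    # relation stated in the text for FA*
    return True
```

Calling `check_signed_adders()` passes all twelve cases, which suggests that the carry expression assumed for FA\* is at least consistent with the relation $2 c_o - s = p - q + c_i$.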

#### **III.IMPLEMENTATION:**

In this paper, we design an AM unit that implements the operation Z = X(A+B). In the conventional design of the AM operator (Fig. 1(a)), the inputs A and B are fed to an adder, and then the input X and the sum Y = A+B are fed to a multiplier to produce the final result Z. The drawback of this method is its high delay. A Carry-Look-Ahead adder can reduce the delay, but it increases the area of the design and hence the power consumption. Direct recoding of the sum into Modified Booth form reduces both the delay and the power consumption.



Volume No: 2 (2015), Issue No: 10 (October) www.ijmetmr.com October 2015 Page 299




Fig. 3. Add-multiply operator based on the (a) conventional design and (b) fused design using direct sum to modified booth recoding.

| Binary     |          |            | $Y_j^{MB}$ |              |              |              | Input carry |
|------------|----------|------------|------------|--------------|--------------|--------------|-------------|
| $Y_{2j+1}$ | $Y_{2j}$ | $Y_{2j-1}$ |            | Sign = $S_j$ | ×1 = One$_j$ | ×2 = Two$_j$ |             |
| 0          | 0        | 0          | 0          | 0            | 0            | 0            | 0           |
| 0          | 0        | 1          | +1         | 0            | 1            | 0            | 0           |
| 0          | 1        | 0          | +1         | 0            | 1            | 0            | 0           |
| 0          | 1        | 1          | +2         | 0            | 0            | 1            | 0           |
| 1          | 0        | 0          | -2         | 1            | 0            | 1            | 1           |
| 1          | 0        | 1          | -1         | 1            | 1            | 0            | 1           |
| 1          | 1        | 0          | -1         | 1            | 1            | 0            | 1           |
| 1          | 1        | 1          | 0          | 1            | 0            | 0            | 0           |

Table I. Modified Booth encoding table

### 3.1 Modified Booth Form:

The Booth algorithm was introduced by A. D. Booth. It has two drawbacks: (i) it is inconvenient for designing parallel multipliers, because the number of add/subtract operations is variable; and (ii) it becomes inefficient when the operand contains isolated 1s. These problems are relaxed by the Modified Booth algorithm (Radix-4), which halves the number of partial products. The idea is to take every second column and multiply by $\pm 1$, $\pm 2$ or 0, instead of shifting and adding for every column of the multiplier term and multiplying by 1 or 0, to obtain the same result. Instead of grouping the bits two at a time as in the Booth algorithm, the Modified Booth algorithm groups the bits three at a time, with adjacent groups overlapping by one bit. The Modified Booth encoding is given in Table I.



$One_j = Y_{2j-1} \oplus Y_{2j}$
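The encoding of Table I can be sketched in Python as follows. This is a minimal behavioural model rather than the gate-level circuit. It assumes the usual radix-4 digit weighting $Y_j^{MB} = -2Y_{2j+1} + Y_{2j} + Y_{2j-1}$ with $Y_{-1} = 0$, an even operand width `n`, and two's-complement operands. The function names are hypothetical, and the expressions for the ×2 and input-carry signals are simply read off the table.

```python
def mb_encode(y2j1, y2j, y2jm1):
    """Encode one bit triplet; returns (digit, sign, one, two, carry) as in Table I."""
    digit = -2 * y2j1 + y2j + y2jm1      # MSB has negative weight
    sign = y2j1                          # Sign = S_j column
    one = y2jm1 ^ y2j                    # One_j = Y_{2j-1} XOR Y_{2j}
    two = 1 if abs(digit) == 2 else 0    # x2 column, read off the table
    carry = 1 if digit < 0 else 0        # input carry column, read off the table
    return digit, sign, one, two, carry


def mb_recode(y, n):
    """Recode an n-bit two's-complement integer (n even) into n/2 MB digits, LSB first."""
    if n % 2:
        raise ValueError("operand width n must be even")
    bits = [(y >> i) & 1 for i in range(n)]  # two's-complement bits, also for y < 0
    digits, prev = [], 0                     # prev holds Y_{2j-1}; Y_{-1} = 0
    for j in range(n // 2):
        digit = mb_encode(bits[2 * j + 1], bits[2 * j], prev)[0]
        digits.append(digit)
        prev = bits[2 * j + 1]               # overlap bit for the next group
    return digits


def mb_value(digits):
    """Evaluate a list of MB digits as the sum of digit * 4**j."""
    return sum(d * 4 ** j for j, d in enumerate(digits))
```

For example, `mb_recode(6, 4)` gives the digits `[-2, 2]`, and `mb_value([-2, 2])` evaluates to $-2 + 2 \cdot 4 = 6$, so four bits are represented by two digits.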



#### **3.2 FAM Implementation:**

The design of the FAM is shown in Fig. 3(b). The multiplier part is implemented using the Modified Booth algorithm. Let X be the multiplier and Y the multiplicand, with Y = A+B. The recoded form of Y is obtained by feeding the inputs A and B to the S-MB recoder, where they are added and recoded into MB form. The partial products are generated and added, along with the Correction Term (CT), in the Wallace tree Carry-Save Adder (CSA). The result of the CSA is then fed to the Carry-Look-Ahead (CLA) adder to obtain Z = XY, i.e. Z = X(A+B).
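The data flow described above can be sketched behaviourally in Python. This is only an illustration under stated assumptions. The S-MB recoder is replaced by a stand-in that forms A+B and then recodes it, so it models the recoder's output but not its carry-free structure. The Wallace tree is modelled as a sequential chain of 3:2 carry-save compressions. Python's signed integers make the Correction Term unnecessary. The final CLA is an ordinary addition. All function names are hypothetical, and `n` (even) must be wide enough to hold A+B in two's complement.

```python
def mb_digits(y, n):
    """Radix-4 MB digits of an n-bit two's-complement integer, LSB first."""
    bits = [(y >> i) & 1 for i in range(n)]
    digits, prev = [], 0
    for j in range(n // 2):
        digits.append(-2 * bits[2 * j + 1] + bits[2 * j] + prev)
        prev = bits[2 * j + 1]
    return digits


def carry_save_reduce(rows):
    """Compress a list of integers into two rows with 3:2 carry-save steps."""
    rows = list(rows)
    while len(rows) > 2:
        a, b, c = rows.pop(), rows.pop(), rows.pop()
        rows.append(a ^ b ^ c)                           # bitwise sum row
        rows.append(((a & b) | (a & c) | (b & c)) << 1)  # carry row, shifted left
    while len(rows) < 2:
        rows.append(0)
    return rows[0], rows[1]


def fam(x, a, b, n):
    """Behavioural model of Z = X(A+B) through MB partial products."""
    y = a + b                                  # stand-in for the S-MB recoder input sum
    digits = mb_digits(y, n)                   # each digit is in {-2, -1, 0, +1, +2}
    partial_products = [(d * x) << (2 * j) for j, d in enumerate(digits)]
    row0, row1 = carry_save_reduce(partial_products)
    return row0 + row1                         # final addition (CLA in hardware)
```

For instance, `fam(3, 2, 2, 8)` returns 12, which matches the a = 2, b = 2 case reported in the results if x = 3 is assumed. An 8-bit Y needs only four partial products, where a non-recoded multiplier would need eight.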

## **3.3 Sum to Modified Booth Recoding Technique:**

Design of signed-bit full adders and half adders: In the S-MB recoding technique, we recode the sum of two consecutive bits of input A ($a_{2j}, a_{2j+1}$) and two consecutive bits of input B ($b_{2j}, b_{2j+1}$) into one MB digit $Y_j^{MB}$. Three bits are included in forming an MB digit: the most significant bit (MSB) is negatively weighted and the two least significant bits (LSBs) are positively weighted. To transform the two aforementioned pairs of bits into MB form we use signed-bit arithmetic, implemented with signed half adders and signed full adders.

#### **IV.RESULTS:**

Fig. 5: Top module simulation result

#### **S-MB recoder**



The simulation window above shows the results of the full adders in the S-MB recoder.


### **MB result**



## **V.CONCLUSION:**

The above simulation has a and b as the inputs for summation and x as the multiplication factor, so the FAM result is shown as sum in the simulation result. With a = 2 and b = 2, the final result is 12.

This work focuses on optimizing the design of the Fused Add-Multiply (FAM) operator. We propose a structured technique for the direct recoding of the sum of two numbers to its MB form. We explore three alternative designs of the proposed S-MB recoder and compare them to the existing ones. The proposed recoding schemes, when incorporated in FAM designs, yield considerable performance improvements in comparison with the most efficient recoding schemes found in the literature.

## **REFERENCES:**

[1] A. Amaricai, M. Vladutiu, and O. Boncalo, "Design issues and implementations for floating-point divide-add fused," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 4, pp. 295–299, Apr. 2010.

[2] E. E. Swartzlander and H. H. M. Saleh, "FFT implementation with fused floating-point operations," IEEE Trans. Comput., vol. 61, no. 2, pp. 284–288, Feb. 2012.

[3] J. J. F. Cavanagh, Digital Computer Arithmetic. New York: McGraw-Hill, 1984.

[4] S. Nikolaidis, E. Karaolis, and E. D. Kyriakis-Bitzaros, "Estimation of signal transition activity in FIR filters implemented by a MAC architecture," IEEE Trans.

[5] O. Kwon, K. Nowka, and E. E. Swartzlander, "A 16-bit by 16-bit MAC design using fast 5:3 compressor cells," J. VLSI Signal Process. Syst., vol. 31, no. 2, pp. 77–89, Jun. 2002.

[6] Y.-H. Seo and D.-W. Kim, "A new VLSI architecture of parallel multiplier–accumulator based on Radix-2 modified Booth algorithm," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 2, pp. 201–208, Feb. 2010.

### **Author's Profile:**



Vakkala Prudhviraj Yadav, Pursuing M.Tech (VLSI), Global College of Engineering & Technology, Kadapa, Andhra Pradesh, India.



A.Asha, M.Tech, Assistant Professor at Global College of Engineering & Technology, Kadapa, Andhra Pradesh, India.
