

A Peer Reviewed Open Access International Journal

# High Performance Multi Precision Design with Dynamic Voltage Scaling Multiplier

### **S.Chandana**

M.Tech (DECS), Department of ECE Annamacharya Institute of Technology and Sciences, Tirupati, India-517520

### **K.Jansi Lakshmi**

Assistant Professor, Department of ECE Annamacharya Institute of Technology and Sciences, Tirupati, India-517520

### **N.Pushpalatha**

Assistant Professor, Department of ECE Annamacharya Institute of Technology and Sciences, Tirupati, India-517520

### ABSTRACT

This paper presents a multi-precision (MP) reconfigurable multiplier that combines variable precision, parallel processing (PP), razor-based dynamic voltage scaling (DVS), and an MP operands scheduler to provide optimum performance under a variety of operating conditions. Most of the building blocks of the proposed reconfigurable multiplier can either work as independent smaller-precision multipliers or work in parallel to perform higher-precision multiplications. Given the user's requirements (e.g., throughput), a dynamic voltage/frequency scaling unit configures the multiplier to operate at the best possible precision and frequency. Razor flip-flops together with a dithering voltage unit then configure the multiplier to achieve the lowest power consumption. The single-switch dithering voltage unit and the razor flip-flops help to reduce the supply voltage level and the overhead typically associated with DVS to a minimum. This low-power MP multiplier is designed in AMIS 0.35-µm CMOS technology. Test results show that the proposed MP design features a 28.2% and 15.8% reduction in circuit area and power consumption, respectively, compared with a conventional fixed-width multiplier. When this MP design is combined with error-tolerant razor-based DVS, PP, and the proposed novel operands scheduler, a 77.7%-86.3% total power reduction is achieved with a total silicon area overhead as low as 11.1%.

*Key words*—*Computer arithmetic, dynamic voltage scaling, low power design, multi-precision multiplier.* 

**I.INTRODUCTION** 

Consumer demand for increasingly portable yet high-performance multimedia and communication products imposes stringent constraints on the power consumption of individual internal components [1]-[4]. Of these, multipliers perform one of the most frequently encountered arithmetic operations in digital signal processors (DSPs) [5]. For embedded applications, it has become essential to design more power-aware multipliers [5]-[8]. Given their fairly complex structure and interconnections, multipliers can exhibit a large number of unbalanced paths, resulting in substantial glitch generation and propagation. This spurious switching activity can be mitigated by balancing internal paths through a combination of architectural and transistor-level optimization techniques. In addition to equalizing internal path delays, dynamic power reduction can also be achieved by monitoring the effective dynamic range of the input operands so as to disable unused sections of the multiplier and/or truncate the output product at the cost of reduced precision.

### **II.EXISTING SYSTEM**

Andrew Donald Booth devised a multiplication algorithm that is known as Booth's algorithm. An 8-bit multiplication computed on a 32-bit Booth multiplier results in unnecessary switching activity and power loss. Several works have investigated this word-length optimization. Conventional DVS techniques consist mainly of lookup table (LUT) and on-chip critical path replica approaches. The critical path approach typically uses an on-chip critical path replica to approximate the actual critical path. Consequently, the voltage can be scaled down to the point at which the replica fails to meet the timing.



Fig. 1. Architecture of Multiplier System

# III. SYSTEM OVERVIEW AND ITS OPERATION

The proposed MP multiplier system (Fig. 1) comprises five different modules, as follows:

1) The MP multiplier.

2) The input operands scheduler (IOS), whose function is to reorder the input data stream into a buffer, thereby reducing the required power supply voltage transitions.

3) The frequency scaling unit (FSU), implemented using a voltage-controlled oscillator (VCO). Its function is to generate the required operating frequency of the multiplier.

4) The voltage scaling unit (VSU), implemented using a voltage dithering unit to limit silicon area overhead. Its function is to dynamically generate the supply voltage so as to minimize power consumption.

5) The dynamic voltage/frequency management unit (VFMU), which receives the user requirements (e.g., throughput).

The VFMU sends control signals to the VSU and FSU to generate the required power supply voltage and clock frequency for the MP multiplier. The multiplier is equipped with razor flip-flops that can report timing errors associated with an insufficiently high supply voltage. The operating principle is as follows.

Initially, the multiplier operates at a standard supply voltage of 3.3 V. If the razor flip-flops of the multiplier do not report any errors, the supply voltage can be reduced. This is accomplished through the VFMU, which sends control signals to the VSU to lower the supply voltage level. When the feedback provided by the razor flip-flops indicates timing errors, the scaling of the power supply is stopped. The proposed multiplier (Fig. 2) combines MP and DVS as well as parallel processing (PP).
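The feedback loop described above can be sketched in a few lines of Python. This is a minimal behavioural sketch rather than the authors' circuit: the callback `has_timing_error`, the 0.1 V step, and the 1.0 V floor are hypothetical, and the loop assumes that the supply is held at the last level for which the razor flip-flops reported no error.

```python
def scale_supply_voltage(has_timing_error, v_start=3.3, v_step=0.1, v_min=1.0):
    """Lower the supply step by step until a timing error is reported.

    has_timing_error: hypothetical callback standing in for the razor
    flip-flop error flag; it takes a voltage and returns True on error.
    v_step and v_min are illustrative values, not taken from the paper.
    """
    voltage = v_start
    while True:
        candidate = round(voltage - v_step, 3)
        if candidate < v_min:
            break  # reached the assumed lower bound of the supply
        if has_timing_error(candidate):
            break  # razor flags an error: stop scaling
        voltage = candidate
    return voltage


# Toy error model (assumption): timing fails below 2.0 V.
final_voltage = scale_supply_voltage(lambda v: v < 2.0)
```

With this toy error model the loop starts at 3.3 V and settles at the lowest level that is still error-free.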

Our multiplier is built from $8 \times 8$-bit reconfigurable multipliers. These building blocks can either work as nine independent multipliers or work in parallel to perform one, two, or three $16 \times 16$-bit multiplications or a single $32 \times 32$-bit operation.
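The operation counts quoted above (nine, three, and one) are consistent with each doubling of precision consuming three lower-precision multipliers. The sketch below only reproduces that arithmetic; the factor of three per doubling is an assumption inferred from those counts, and the function names are hypothetical.

```python
TOTAL_PES = 9  # nine 8 x 8-bit building blocks, as stated in the text


def pes_per_operation(width):
    """Number of 8-bit building blocks assumed for one width x width multiply."""
    if width not in (8, 16, 32):
        raise ValueError("supported precisions are 8, 16 and 32 bits")
    pes, current = 1, 8
    while current < width:
        pes *= 3       # assumption: three sub-multipliers per doubling
        current *= 2
    return pes


def max_parallel_operations(width):
    """How many multiplications of this width fit into the nine blocks."""
    return TOTAL_PES // pes_per_operation(width)


capacity = {w: max_parallel_operations(w) for w in (8, 16, 32)}
```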

Fig. 3 illustrates the benefits of the different approaches being considered. Power consumption is a linear function of the workload, which is usually represented by the precision of the input operands. Curve 1 corresponds to the case of a fixed-width (FP) multiplier using a fixed power supply. Region 1 shows the power optimization space for MP techniques, which use multiplications of different precision to reduce power. If one combines the MP multiplier with DVS, power is further reduced, with curves (1)-(3) becoming curves (4)-(6), respectively.

Volume No: 2 (2015), Issue No: 10 (October) www.ijmetmr.com






Regions 1 and 2 show the power optimization space for the combined approach. With PP, the operating frequency can be reduced together with the supply voltage, as indicated by curves (7) and (8).



Fig. 3. Conceptual view of optimization spaces of MP, DVS, and PP approaches.

Finally, region 3 shows the optimization space for the proposed approach, which combines MP and DVS with PP.

#### **IV. RECONFIGURABLE MP MULTIPLIER**

This section describes the structure of the input interface unit, which is a submodule of the MP multiplier (Fig. 1). The role of this input interface unit is to distribute the input data between the nine independent processing elements (PEs) (Fig. 2) of the $32 \times 32$-bit MP multiplier according to the selected operation mode. A 3-bit control bus indicates whether the inputs are 1/4/9 pair(s) of 8-bit operands, 1/2/3 pair(s) of 16-bit operands, or 1 pair of 32-bit operands, respectively. To assess the overhead related to reconfigurability and MP, let $X$ and $Y$ be the $2n$-bit-wide multiplicand and multiplier, respectively. $X_H$ and $Y_H$ are their respective $n$ most significant bits, whereas $X_L$ and $Y_L$ are their respective $n$ least significant bits. $X_LY_L$, $X_HY_L$, $X_LY_H$, and $X_HY_H$ are the crosswise products.

The product of *X* and *Y* can be expressed as follows:

$$P = (X_H Y_H) 2^{2n} + (X_H Y_L + X_L Y_H) 2^n + X_L Y_L \qquad (1)$$

Thus, a $2n$-bit reconfigurable multiplier can be built using adders and four $n \times n$-bit multipliers to compute $X_H Y_H$, $X_H Y_L$, $X_L Y_H$, and $X_L Y_L$.

This results in overheads of 18% and 13% for the silicon area and power, respectively. However, if we define

$$X' = X_H + X_L \qquad (2)$$

$$Y' = Y_H + Y_L \qquad (3)$$

Then (1) could be rewritten as follows:

$$P = (X_H Y_H) 2^{2n} + (X'Y' - X_H Y_H - X_L Y_L) 2^n + X_L Y_L \qquad (4)$$

Comparing (1) and (4), we have removed one $n \times n$-bit multiplier (for computing $X_H Y_L$ or $X_L Y_H$) and one $2n$-bit adder (for computing $X_H Y_L + X_L Y_H$). These are replaced with two $n$-bit adders (for computing $X_H + X_L$ and $Y_H + Y_L$) and two $(2n + 2)$-bit subtractors (for computing $X'Y' - X_HY_H - X_LY_L$). In a 32-bit multiplier, this significantly reduces the design complexity, because two 34-bit subtractors replace a $16 \times 16$-bit multiplier. We then need two $16 \times 16$-bit multipliers (for computing $X_HY_H$ and $X_LY_L$) and one $17 \times 17$-bit multiplier (for computing $X'Y'$).
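The equivalence of (1) and (4) can be checked numerically. The following is a minimal sketch for unsigned operands with $n = 16$; the function names are hypothetical, and Python integers stand in for the hardware adders, subtractors, and multipliers.

```python
def split_operand(value, n):
    """Return the (high, low) n-bit halves of a 2n-bit unsigned value."""
    return value >> n, value & ((1 << n) - 1)


def multiply_eq1(x, y, n=16):
    """Equation (1): four n x n products."""
    xh, xl = split_operand(x, n)
    yh, yl = split_operand(y, n)
    return ((xh * yh) << (2 * n)) + ((xh * yl + xl * yh) << n) + xl * yl


def multiply_eq4(x, y, n=16):
    """Equation (4): two n x n products and one (n+1) x (n+1) product."""
    xh, xl = split_operand(x, n)
    yh, yl = split_operand(y, n)
    x_sum = xh + xl  # X' of equation (2), at most n+1 bits wide
    y_sum = yh + yl  # Y' of equation (3), at most n+1 bits wide
    high = xh * yh
    low = xl * yl
    middle = x_sum * y_sum - high - low  # equals X_H*Y_L + X_L*Y_H
    return (high << (2 * n)) + (middle << n) + low


x, y = 0x12345623, 0x12345342  # example 32-bit operands
same_result = multiply_eq1(x, y) == multiply_eq4(x, y) == x * y
```

Because $X'$ and $Y'$ can each carry one extra bit, the middle product needs a $17 \times 17$-bit multiplier when $n = 16$, which matches the text above.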

To evaluate the proposed MP architecture, a conventional 32-bit fixed-width multiplier and four-sub-block MP multipliers are designed using a Booth radix-4 Wallace tree structure similar to that used for the building blocks of our three-sub-block MP multiplier.
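For readers unfamiliar with the radix-4 Booth recoding named above, the sketch below shows the standard recoding of a two's-complement multiplier into digits from {-2, -1, 0, 1, 2}. It illustrates the textbook technique only; it is not the authors' implementation, the Wallace tree that sums the partial products is not modelled, and the function names are hypothetical.

```python
def booth_radix4_digits(y, bits):
    """Recode a signed multiplier into radix-4 Booth digits (LSB first).

    y must be representable in `bits`-bit two's complement; bits must be even.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    pattern = y & ((1 << bits) - 1)  # two's-complement bit pattern of y
    digits = []
    previous = 0                     # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        digits.append(b0 + previous - 2 * b1)
        previous = b1
    return digits


def booth_multiply(x, y, bits):
    """Sum the partial products selected by the Booth digits."""
    return sum(d * x * 4 ** i for i, d in enumerate(booth_radix4_digits(y, bits)))


product = booth_multiply(93, -57, bits=8)
```

Each digit selects 0, ±x, or ±2x, so a `bits`-bit multiplier generates only `bits / 2` partial products.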

### V. DYNAMIC VOLTAGE SCALING AND FREQUENCY SCALING MANAGEMENT

### **A. DVS Unit**

In our implementation (Fig. 1), a dynamic power supply and a VCO are used to achieve real-time dynamic voltage and frequency scaling under various operating conditions.

Near-optimal dynamic voltage scaling can be achieved by using voltage dithering, which exhibits a faster response time than conventional voltage regulators. Voltage dithering uses power switches to connect different supply voltages to the load, depending on the time slots.
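As a first-order illustration of dithering between supply levels in time slots, the sketch below computes the time-averaged voltage of a slot schedule and the fraction of time the higher rail must be connected to reach a target average. This is a simplified model under stated assumptions: the rail values and slot durations are hypothetical, and switching losses and load dynamics are ignored.

```python
def dithered_average(slots):
    """Time-averaged supply for a list of (voltage, duration) slots."""
    total_time = sum(duration for _, duration in slots)
    return sum(voltage * duration for voltage, duration in slots) / total_time


def high_rail_duty(v_high, v_low, v_target):
    """Fraction of time on the high rail needed for a target average."""
    if not v_low <= v_target <= v_high:
        raise ValueError("target must lie between the two rails")
    return (v_target - v_low) / (v_high - v_low)


# Hypothetical rails: three time slots at 3.0 V followed by one slot at 1.0 V.
average = dithered_average([(3.0, 3), (1.0, 1)])
duty = high_rail_duty(3.0, 1.0, 2.5)
```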

### **B. Dynamic Frequency Scaling Unit**

In the proposed $32 \times 32$-bit MP multiplier, dynamic frequency scaling is used to meet throughput requirements. It is based on a VCO implemented as a seven-stage current-starved ring oscillator. The output frequency of the VCO can be tuned from 5 to 50 MHz using four control bits (5 MHz/step). This frequency range is chosen to meet the requirements of general-purpose DSP applications.
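The tuning range above implies ten frequency levels selected by a 4-bit code. The sketch below picks the lowest level that satisfies a required clock frequency. The linear mapping from code 0 (5 MHz) to code 9 (50 MHz) is an assumption made for illustration; the text gives only the range and the step size.

```python
import math

F_MIN_MHZ = 5
F_MAX_MHZ = 50
F_STEP_MHZ = 5
NUM_LEVELS = (F_MAX_MHZ - F_MIN_MHZ) // F_STEP_MHZ + 1  # ten usable levels


def vco_frequency_mhz(code):
    """Output frequency for a control code (assumed linear encoding)."""
    if not 0 <= code < NUM_LEVELS:
        raise ValueError("control code outside the tuning range")
    return F_MIN_MHZ + code * F_STEP_MHZ


def code_for_frequency(required_mhz):
    """Smallest control code whose frequency meets the requirement."""
    if required_mhz > F_MAX_MHZ:
        raise ValueError("requirement exceeds the VCO tuning range")
    return max(0, math.ceil((required_mhz - F_MIN_MHZ) / F_STEP_MHZ))


selected = vco_frequency_mhz(code_for_frequency(12))
```

Choosing the lowest sufficient level follows the stated aim of meeting throughput requirements without running the clock faster than necessary.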

Dynamic frequency scaling, also known as CPU throttling, is a technique in computer architecture whereby the frequency of a microprocessor can be automatically adjusted "on the fly," either to conserve power or to reduce the amount of heat generated by the chip. It is also used to reduce heat in insufficiently cooled systems when the temperature reaches a certain threshold, for example in poorly cooled overclocked systems.

Dynamic frequency scaling is commonly used in laptops and other mobile devices, where energy comes from a battery and is therefore limited. It is also used in quiet computing settings and to decrease energy and cooling costs for lightly loaded machines. Less heat output, in turn, allows the system cooling fans to be throttled down or turned off, reducing noise levels and further decreasing power consumption.

### VI. INPUT OPERANDS SCHEDULER

The input operands scheduler (IOS) reorders the input data and thereby reduces the number of supply voltage transitions, which lowers power consumption. It consists of a range detector, a buffer (RAM), and a voltage and frequency analyzer. Together these reorder the input data, detect its precision, and send it to the MP multiplier. The proposed IOS performs the following tasks:

1) Reorder the input data stream so that same-precision operands are grouped together into a buffer; and

2) Take the minimum supply voltage and frequency from the LUT in the input operands scheduler.
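The grouping task can be sketched as follows. This is a simplified software model, not the authors' hardware: operands are assumed to be unsigned, the whole stream is treated as a single buffer, the function names are hypothetical, and the LUT lookup of task 2 is omitted because its contents are not given here.

```python
PRECISIONS = (8, 16, 32)


def detect_precision(a, b):
    """Range detector: smallest supported width that holds both operands."""
    width = max(a.bit_length(), b.bit_length())
    for precision in PRECISIONS:
        if width <= precision:
            return precision
    raise ValueError("operands wider than 32 bits")


def schedule(pairs):
    """Group same-precision pairs together, keeping arrival order per group."""
    return sorted(pairs, key=lambda pair: detect_precision(*pair))


def precision_changes(pairs):
    """Number of precision changes, a proxy for supply voltage transitions."""
    widths = [detect_precision(a, b) for a, b in pairs]
    return sum(1 for w, nxt in zip(widths, widths[1:]) if w != nxt)


stream = [(3, 5), (70000, 2), (200, 300), (7, 9), (0x12345623, 1), (1000, 2000)]
before = precision_changes(stream)
after = precision_changes(schedule(stream))
```

For this example stream the number of precision changes drops from five to two after grouping. `sorted` is stable, so operand pairs of equal precision keep their original relative order.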

The operation of the multiplier is controlled by two external signals, i.e., the operating frequency and the supply voltage. These two signals are tuned to the right values depending on the actual workload, i.e., on the input operands. The simulation is performed by applying input operands and comparing the results with those computed on a PC, which provides the reference results. In addition, timing is checked. The supported multiplication precisions include data word lengths up to 32 bits.

### **VII.SIMULATION RESULTS**

The simulation results of the existing $32 \times 32$-bit fixed-width multiplier are shown in Fig. 4, and the simulation results of the proposed multi-precision multiplier are shown in Fig. 5.

[Simulation waveform, 500-900 ns of a 1000 ns run, showing the signals s[63:0], a[31:0], and b[31:0]]

Fig. 4. Fixed-width multiplier simulation results

[Simulation waveform, 500-950 ns of a 1000 ns run, showing the signals s[63:0], a[31:0] = 32'h12345623, and b[31:0] = 32'h12345342]

Fig. 5. Multi-precision multiplier simulation results

Table 1. Comparison between existing and proposed

| Design                     | Power (mW) | Area (mm<sup>2</sup>) | Delay (ns) |
|----------------------------|------------|-----------------------|------------|
| Fixed-width multiplier     | 25         | 0.75                  | 41.766     |
| Multi-precision multiplier | 19         | 0.67                  | 26.997     |

#### **VIII. CONCLUSION**

This paper proposed a novel MP multiplier architecture featuring, respectively, 28.2% and 15.8% reductions in silicon area and power consumption compared with its $32 \times 32$-bit conventional fixed-width multiplier counterpart. When this MP multiplier architecture is integrated with an error-tolerant razor-based DVS approach and the proposed novel operands scheduler, a 77.7%–86.3% total power reduction was achieved with a total silicon area overhead as low as 11.1%.

The designed chip demonstrated run-time adaptation to the actual workload by operating at a dynamically scaled supply voltage level and the minimum clock frequency while meeting throughput requirements. The proposed novel dedicated operands scheduler rearranges operations on input operands so as to reduce the number of supply voltage transitions and, in turn, minimize the overall power consumption of the multiplier. The proposed MP razor-based DVS multiplier provides a solution toward achieving full computational flexibility and low power consumption for various general-purpose low-power applications.




#### **REFERENCES**

[1] Xiaoxiao Zhang and Amine Bermak, "32 Bit × 32 Bit Multiprecision Razor-Based Dynamic Voltage Scaling Multiplier with Operands Scheduler," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 22, no. 4, Apr. 2014.

[2] R. Min, M. Bhardwaj, S.-H. Cho, N. Ickes, E. Shih, A. Sinha, A. Wang, and A. Chandrakasan, "Energy-centric enabling technologies for wireless sensor networks," IEEE Wirel. Commun., vol. 9, no. 4, pp. 28–39, Aug. 2002.

[3] M. Bhardwaj, R. Min, and A. Chandrakasan, "Quantifying and enhancing power awareness of VLSI systems," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 6, pp. 757–772, Dec. 2001.

[4] A. Wang and A. Chandrakasan, "Energy-aware architectures for a real-valued FFT implementation," in Proc. IEEE Int. Symp. Low Power Electron. Design, Aug. 2003, pp. 360–365.

[5] T. Kuroda, "Low power CMOS digital design for multimedia processors," in Proc. Int. Conf. VLSI CAD, Oct. 1999, pp. 359–367.

[6] H. Lee, "A power-aware scalable pipelined booth multiplier," in Proc. IEEE Int. SOC Conf., Sep. 2004, pp. 123–126.

[7] S.-R. Kuang and J.-P. Wang, "Design of power-efficient configurable booth multiplier," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 3, pp. 568–580, Mar. 2010.

[8] O. A. Pfander, R. Hacker, and H.-J. Pfleiderer, "A multiplexer-based concept for reconfigurable multiplier arrays," in Proc. Int. Conf. Field Program. Logic Appl., vol. 3203. Sep. 2004, pp. 938–942.

[9] F. Carbognani, F. Buergin, N. Felber, H. Kaeslin, and W. Fichtner, "Transmission gates combined with level-restoring CMOS gates reduce glitches in low-Large Scale Integr. (VLSI) Syst., vol. 16, no. 7, pp. 830–836, Jul. 2008.

[10] T. Yamanaka and V. G. Moshnyaga, "Reducing multiplier energy by data-driven voltage variation," in Proc. IEEE Int. Symp. Circuits Syst., May 2004, pp. 285–288.

[11] W. Ling and Y. Savaria, "Variable-precision multiplier for equalizer with adaptive modulation," in Proc. 47th Midwest Symp. Circuits Syst., vol. 1. Jul. 2004, pp. I-553–I-556.

[12] K.-S. Chong, B.-H. Gwee, and J. S. Chang, "A micropower low-voltage multiplier with reduced spurious switching," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 2, pp. 255–265, Feb. 2005.

[13] M. Sjalander, M. Drazdziulis, P. Larsson-Edefors, and H. Eriksson, "A low-leakage twin-precision multiplier using reconfigurable power gating," in Proc. IEEE Int. Symp. Circuits Syst., May 2005, pp. 1654–1657.

### **ABOUT AUTHORS**



**S. Chandana** received the B.Tech degree in E.C.E from Vemu Engineering College, P.Kothakota, India, in 2013. She is pursuing her M.Tech degree at Annamacharya Institute of Technology and Sciences (AITS), Tirupati. Her area of interest is digital image processing.






**K. Jansi Lakshmi** received the B.Tech degree in Electronics and Communication Engineering from JNTU, Hyderabad, in 2010 and the M.Tech degree (VLSI System Design) from JNTU, Ananthapur, in 2012. She is working as an Assistant Professor at Annamacharya Institute of Technology and Sciences, Tirupati. Her research areas of interest are VLSI system design and communication systems.



**N. Pushpalatha** completed her B.Tech at JNTU, Hyderabad, in 2004 and her M.Tech at A.I.T.S., Rajampet, in 2007. She has been working as Assistant Professor of ECE at Annamacharya Institute of Technology and Sciences, Tirupati, since 2006. She has guided many B.Tech and M.Tech projects. Her research areas include data communications and ad-hoc wireless sensor networks.