

A Peer Reviewed Open Access International Journal

Network on Chip (NOC) Data Encoding Techniques Using VHDL

**B.Santosh Kumar**

Assistant Professor, Dept of ECE, Narasimha Reddy Engineering College.

**Dr.N.Murali Mohan**

HOD, Dept of ECE, Narasimha Reddy Engineering College.

**P.Victoria Rani**

M.Tech (VLSI), Narasimha Reddy Engineering College.

### **ABSTRACT:**

As technology shrinks, the power dissipated by the links of a network-on-chip (NoC) starts to compete with the power dissipated by the other elements of the communication subsystem, namely, the routers and the network interfaces (NIs). In this paper, we present a set of data encoding schemes aimed at reducing the power dissipated by the links of an NoC. The proposed schemes are general and transparent with respect to the underlying NoC fabric (i.e., their application does not require any modification of the routers and link architecture). Experiments carried out on both synthetic and real traffic scenarios show the effectiveness of the proposed schemes, which allow savings of up to 51% in power dissipation and 14% in energy consumption without any significant performance degradation and with less than 15% area overhead in the NI. In this work, we use the Xilinx ISE tool for logical verification and then synthesize the design with the same tool for the target technology, performing place-and-route for system verification.

#### **Keywords:**

Computer arithmetic, multiplication by constants, common sub expressions sharing, Add-Multiply operation, arithmetic circuits, Modified Booth recoding, VLSI design.

### I. INTRODUCTION:

The advances in fabrication technology allow designers to implement a whole system on a single chip, but the inherent design complexity of such systems makes it hard to fully explore the technology potential. Thus, the design of Systems-on-Chip (SoCs) is usually based on the reuse of predesigned and pre-verified intellectual property cores that are interconnected through special communication resources that must handle very tight performance and area constraints. In addition to those application-related constraints, deep submicron effects pose physical design challenges for long wires and global on-chip communication. A possible approach to overcome those challenges is to change from a fully synchronous design paradigm to a globally asynchronous, locally synchronous (GALS) design paradigm. A Network-on-Chip (NoC) is an infrastructure essentially composed of routers interconnected by communication channels. It is suitable to support the GALS paradigm, since it provides asynchronous communication, scalability, reusability and reliability.

The growing market for portable battery-powered devices adds a new dimension, power, to the VLSI design space, previously characterized by speed and area. Power consumption is directly related to battery life as well as to costly package and heatsink requirements for high-end devices. In order to ensure that the final system complies with the desired functional, thermal and cost requirements, the power consumption issues must be addressed during the design of all subsystems in a SoC, including the interconnect structure. One problem related to power consumption in busses is the capacitance induced by long wires. This problem is minimized in NoCs, since short point-to-point wires are used between routers. However, NoCs consume power in routers, diminishing the apparent advantage in terms of power when compared to busses.




The power consumption in a NoC grows linearly with the number of bit transitions in subsequent data packets sent through the interconnect architecture. One way to reduce power consumption in NoCs, in both wires and logic, is to reduce the switching activity by means of coding schemes. Several schemes were proposed in the late 1990s, all of them addressing bus-based communication architectures. The contributions of this work are twofold: (i) the evaluation of coding schemes in the context of NoC-based systems and the trade-off analysis of the power savings obtained by the application of such coding schemes versus the power consumption overhead due to the additional encoding and decoding circuitry, and (ii) the proposal of a new coding scheme suitable for NoC-based systems.

In the binary number system, the digits, called bits, are limited to the set {0, 1}. The result of multiplying any binary number by a single bit is either 0 or the original number. This makes forming the intermediate partial products simple and efficient. Summing these partial products is the time-consuming task for binary multipliers. One logical approach is to form the partial products one at a time and sum them as they are generated. Often implemented in software on processors that do not have a hardware multiplier, this technique works fine but is slow, because at least one machine cycle is required to sum each additional partial product. For applications where this approach does not provide enough performance, multipliers can be implemented directly in hardware.

#### Power Consumption in Modern VLSI Systems:

Power consumption is becoming a crucial factor in the design of high-speed digital systems [1, 4, 5, 8, 12, 20, 21, 38, 44]. Whereas static power consumption is due to leakage and short-circuit currents, dynamic power consumption stems from switching activity, i.e., bit transitions. Interconnects consume the lion's share of dynamic power in modern chips. For example, studies show that interconnect links consume up to 60% of the dynamic power in NoCs [1, 42], more than 60% of the dynamic power in a modern microprocessor [21], and more than 90% in FPGAs [15]. This portion is apparently growing [1, 9, 12, 33, 38, 44].

### **Structure and Contributions:**

The first problem we tackle in this thesis is to reduce the number of redundant bit transmissions resulting from error detection. A solution to this problem is given in Chapter 2. Our analysis shows that, for example, on a 4x4 NoC with one redundant parity bit, our technique reduces the redundant information transmitted by 75%, and the savings increase asymptotically to 100% with the size of the NoC. Our second goal in this thesis is to develop an approach for reducing the number of bit transitions in NoC interconnect links. Analysis and simulations of our technique demonstrate a reduction of up to 55% in the number of bit transitions and up to 40% savings in the power consumed on the link.

In the next several years, the availability of chips with 1000 cores is foreseen. In these chips, a significant fraction of the total system power budget is dissipated by interconnection networks. Therefore, the design of power-efficient interconnection networks has been the focus of many works published in the literature dealing with NoC architectures. These works concentrate on different components of the interconnection networks such as routers, NIs, and links. Since the focus of this paper is on reducing the power dissipated by the links, in this section, we briefly review some of the works in the area of link power reduction. These include the techniques that make use of shielding, increasing line-to-line spacing, and repeater insertion.

They all increase the chip area. The data encoding scheme is another method that was employed to reduce the link power dissipation. The data encoding techniques may be classified into two categories. In the first category, encoding techniques concentrate on lowering the power due to self-switching activity of individual bus lines while ignoring the power dissipation owing to their coupling switching activity.




In this category, bus invert (BI) and INC-XOR have been proposed for the case in which random data patterns are transmitted via these lines. On the other hand, Gray code, working-zone encoding, and T0-XOR were suggested for the case of correlated data patterns. Application-specific approaches have also been proposed. This category of encoding is not suitable for deep submicron technology nodes, where the coupling capacitance constitutes a major part of the total interconnect capacitance. There, the power consumption due to the coupling switching activity becomes a large fraction of the total link power consumption, making the aforementioned techniques, which ignore such contributions, inefficient. The works in the second category concentrate on reducing power dissipation through the reduction of the coupling switching. Among these schemes, some reduce the switching activity using many extra control lines; for example, the data bus width grows from 32 to 55. Other proposed techniques have a smaller number of control lines, but the complexity of their decoding logic is high.

One technique from the literature works as follows: first, the data are both odd inverted and even inverted, and then transmission is performed using the kind of inversion that reduces the switching activity more. With it, the coupling switching activity is reduced by up to 39%. Compared with that work, we use a simpler decoder while achieving a higher activity reduction. Let us now discuss in more detail the works with which we compare our proposed schemes. In one of them, the number of transitions from 0 to 1 for two consecutive flits (the flit that just traversed the link and the one that is about to traverse it) is counted. If the number is larger than half of the link width, the inversion is performed to reduce the number of 0 to 1 transitions when the flit is transferred via the link. This technique is only concerned with the self-switching and does not account for the coupling switching.
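The 0-to-1 counting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the circuit of the compared work; the function names and the list-of-bits flit representation are assumptions introduced here.

```python
def count_rising(prev, cur):
    """Count the lines that go from 0 to 1 between two consecutive flits."""
    return sum(1 for p, c in zip(prev, cur) if p == 0 and c == 1)


def invert_if_rising_majority(prev, cur):
    """Apply the rule from the text: invert the whole flit when more than
    half of the lines would otherwise make a 0 -> 1 transition.

    Returns (flit_to_send, inverted_flag). Both names are hypothetical.
    """
    if count_rising(prev, cur) > len(cur) / 2:
        return [1 - b for b in cur], True
    return list(cur), False


# Three of four lines would rise, so the flit is sent inverted.
sent, inverted = invert_if_rising_majority([0, 0, 0, 0], [1, 1, 1, 0])
print(sent, inverted)
```

Note that the decision looks only at each line in isolation, which is exactly why such a rule cannot reduce the coupling switching between adjacent lines.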

Note that the coupling capacitance in state-of-the-art silicon technology is considerably larger (e.g., four times) than the self-capacitance and hence should be considered in any scheme proposed for link power reduction.

### **PROPOSED ENCODING SCHEMES:**

We present the proposed encoding scheme whose goal is to reduce power dissipation by minimizing the coupling transition activities on the links of the interconnection network. Let us first describe the power model that contains different components of power dissipation of a link. The dynamic power dissipated by the interconnects and drivers is

$$P = [T_{0\to 1} (C_s + C_l) + T_c C_c] V_{dd}^2 F_{ck} \tag{1}$$

where $T_{0\to 1}$ is the number of $0 \rightarrow 1$ transitions in the bus in two consecutive transmissions, $T_c$ is the number of correlated switchings between physically adjacent lines, $C_s$ is the line-to-substrate capacitance, $C_l$ is the load capacitance, $C_c$ is the coupling capacitance, $V_{dd}$ is the supply voltage, and $F_{ck}$ is the clock frequency. One can classify four types of coupling transitions as described in [26]. A Type I transition occurs when one of the lines switches while the other remains unchanged. In a Type II transition, one line switches from low to high while the other makes a transition from high to low. A Type III transition corresponds to the case where both lines switch simultaneously in the same direction. Finally, in a Type IV transition neither line changes.
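The four transition types can be made concrete with a small classifier. The sketch below is an illustration under assumptions made here: flits are lists of bits, adjacent list positions stand for physically adjacent lines, and the function names are hypothetical.

```python
def transition_type(prev_pair, cur_pair):
    """Classify the coupling transition of two adjacent lines as Type 1..4.

    prev_pair and cur_pair are (line_a, line_b) bit tuples at two
    consecutive transmissions.
    """
    a_switches = prev_pair[0] != cur_pair[0]
    b_switches = prev_pair[1] != cur_pair[1]
    if a_switches != b_switches:
        return 1  # Type I: exactly one line switches
    if not a_switches:
        return 4  # Type IV: neither line changes
    # Both lines switch: equal final values mean the same direction.
    return 3 if cur_pair[0] == cur_pair[1] else 2


def count_types(prev, cur):
    """Count Type I..IV transitions over all adjacent line pairs."""
    counts = {1: 0, 2: 0, 3: 0, 4: 0}
    for i in range(len(cur) - 1):
        t = transition_type((prev[i], prev[i + 1]), (cur[i], cur[i + 1]))
        counts[t] += 1
    return counts


# Pairs: (0,1)->(1,0) is Type II, (1,1)->(0,1) is Type I, (1,0)->(1,0) is Type IV.
print(count_types([0, 1, 1, 0], [1, 0, 1, 0]))
```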

The effective switched capacitance varies from type to type, and hence the coupling transition activity, $T_c$, is a weighted sum of the contributions of the different types of coupling transition. Therefore

$$T_c = K_1 T_1 + K_2 T_2 + K_3 T_3 + K_4 T_4 \tag{2}$$

where $T_i$ is the average number of Type $i$ transitions and $K_i$ is the corresponding weight. According to [26], we use $K_1 = 1$, $K_2 = 2$, and $K_3 = K_4 = 0$. The occurrence probabilities of Types I and II for a random set of data are 1/2 and 1/8, respectively. This leads to a higher value for $K_1 T_1$ compared with $K_2 T_2$, suggesting that minimizing the number of Type I transitions may lead to a considerable power reduction. Using (2), one may express (1) as

$$P = [T_{0\to 1} (C_s + C_l) + (T_1 + 2T_2) C_c] V_{dd}^2 F_{ck}. \tag{3}$$

According to [3], $C_l$ can be neglected, so that

$$P \propto T_{0 \to 1} C_s + (T_1 + 2T_2) C_c. \tag{4}$$
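As a worked instance of (4), the following sketch evaluates the relative link cost of sending one flit after another. The function name, the list-of-bits representation, and the default values $C_s = 1$ and $C_c = 4$ (following the "four times" example ratio mentioned earlier) are assumptions chosen for illustration.

```python
def link_cost(prev, cur, c_s=1.0, c_c=4.0):
    """Relative link power per (4): T_{0->1} * C_s + (T_1 + 2 * T_2) * C_c.

    c_s and c_c are illustrative defaults, not measured capacitances.
    """
    # Self-switching term: lines that rise from 0 to 1.
    t01 = sum(1 for p, c in zip(prev, cur) if p == 0 and c == 1)
    t1 = t2 = 0
    for i in range(len(cur) - 1):
        a_sw = prev[i] != cur[i]
        b_sw = prev[i + 1] != cur[i + 1]
        if a_sw != b_sw:
            t1 += 1  # Type I: exactly one line of the pair switches
        elif a_sw and cur[i] != cur[i + 1]:
            t2 += 1  # Type II: both switch, in opposite directions
    return t01 * c_s + (t1 + 2 * t2) * c_c


# One rising line, one Type I pair and one Type II pair:
# 1 * 1.0 + (1 + 2 * 1) * 4.0 = 13.0
print(link_cost([0, 1, 1, 0], [1, 0, 1, 0]))
```

With these weights the coupling term dominates the result, which is the motivation for encoding schemes that target Type I and Type II transitions.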

### A. Scheme I:

The scheme compares the current data with the previous one to decide whether odd inversion or no inversion of the current data can lead to the link power reduction.

1) Power Model: If the flit is odd inverted before being transmitted, the dynamic power on the link is

$$P' \propto T'_{0 \to 1} C_s + \left(K_1 T'_1 + K_2 T'_2 + K_3 T'_3 + K_4 T'_4\right) C_c \tag{5}$$

where the primed quantities are the self-transition and coupling-transition counts of the odd-inverted flit.

Fig. 1. Encoder architecture Scheme I. (a) Circuit diagram. (b) Internal view of the encoder block (E).

Also, since $T_{0 \to 1} = T_{0 \to 1(\text{odd})} + T_{0 \to 1(\text{even})}$, and taking the coupling capacitance to be four times the self-capacitance, one may write the condition $P > P'$ as

$$\tfrac{1}{4} T_{0 \to 1(\text{odd})} + T_1 + 2T_2 > \tfrac{1}{4} T_{0 \to 0(\text{odd})} + T_2 + T_3 + T_4 + 2T_1^{***} \tag{7}$$

which is the exact condition to be used to decide whether the odd invert has to be performed. Since the terms $T_{0 \to 1(\text{odd})}$ and $T_{0 \to 0(\text{odd})}$ are weighted with a factor of 1/4, for link widths greater than 16 bits the misprediction of the invert condition will not exceed 1.2% on average. Thus, we can approximate the exact condition as

$$T_1 + 2T_2 > T_2 + T_3 + T_4 + 2T_1^{***}. \tag{8}$$

Of course, the use of the approximated odd invert condition reduces the effectiveness of the encoding scheme because of the error induced by the approximation, but it simplifies the hardware implementation of the encoder. Now, defining

$$T_x = T_3 + T_4 + T_1^{***}$$

and

$$T_y = T_2 + T_1 - T_1^{***} \tag{9}$$

one can rewrite (8) as

$$T_y > T_x. \tag{10}$$

Assuming a link width of $w$ bits, the total number of transitions between adjacent lines is $w - 1$, and hence

$$T_y + T_x = w - 1. \tag{11}$$

Thus, we can write (10) as

$$T_y > \frac{w-1}{2}. \tag{12}$$
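Conditions (10) to (12) can be checked numerically. The sketch below counts $T_y$ as the adjacent pairs whose coupling weight drops under odd inversion and $T_x$ as those whose weight rises, using the weights $K_1 = 1$, $K_2 = 2$, $K_3 = K_4 = 0$. Treating lines with odd list index as the "odd" lines, and all function names, are assumptions of this sketch.

```python
WEIGHT = {1: 1, 2: 2, 3: 0, 4: 0}  # K1..K4 as given in the text


def pair_type(p0, p1, c0, c1):
    """Coupling transition type (1..4) of two adjacent lines."""
    s0, s1 = p0 != c0, p1 != c1
    if s0 != s1:
        return 1
    if not s0:
        return 4
    return 3 if c0 == c1 else 2


def pair_weights(prev, cur):
    """Coupling weight of every adjacent line pair."""
    return [WEIGHT[pair_type(prev[i], prev[i + 1], cur[i], cur[i + 1])]
            for i in range(len(cur) - 1)]


def odd_invert(flit):
    """Invert the bits at odd indices (assumed to be the odd lines)."""
    return [b ^ (i & 1) for i, b in enumerate(flit)]


def ty_tx(prev, cur):
    """Return (T_y, T_x): pairs that get cheaper or costlier if odd inverted."""
    plain = pair_weights(prev, cur)
    inverted = pair_weights(prev, odd_invert(cur))
    ty = sum(1 for a, b in zip(plain, inverted) if b < a)
    tx = sum(1 for a, b in zip(plain, inverted) if b > a)
    return ty, tx


def odd_invert_condition(prev, cur):
    """Condition (12): T_y > (w - 1) / 2."""
    ty, _ = ty_tx(prev, cur)
    return ty > (len(cur) - 1) / 2


prev = [0, 1, 0, 1, 0, 1, 0, 1]
cur = [1, 0, 1, 0, 1, 0, 1, 0]      # every adjacent pair is Type II
print(ty_tx(prev, cur))              # (7, 0), so T_y + T_x = w - 1
print(odd_invert_condition(prev, cur))
```

Because odd inversion changes the switching state of exactly one line in every adjacent pair, each pair's weight either drops or rises, which is why $T_y + T_x = w - 1$ holds in this model.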




This is the condition used to determine whether the odd inversion has to be performed or not.

2) Proposed Encoding Architecture: The proposed encoding architecture, which is based on the odd invert condition defined by (12), is shown in Fig. 1. We consider a link width of $w$ bits. If no encoding is used, the body flits are grouped in $w$ bits by the NI and are transmitted via the link. In our approach, one bit of the link is used for the inversion bit, which indicates if the flit traversing the link has been inverted or not. More specifically, the NI packs the body flits in $w-1$ bits [Fig. 1(a)]. The encoding logic E, which is integrated into the NI, is responsible for deciding if the inversion should take place and for performing the inversion if needed. The generic block diagram shown in Fig. 1(a) is the same for all three encoding schemes proposed in this paper; only the block E differs between the schemes. To make the decision, the previously encoded flit is compared with the current flit being transmitted. The latter, whose bits are the concatenation of the $w-1$ payload bits and a "0" bit, represents the first input of the encoder, while the previously encoded flit represents the second input of the encoder [Fig. 1(b)].

The $w-1$ bits of the incoming (previously encoded) body flit are indicated by $X_i$ ($Y_i$), $i = 0, 1, \ldots, w-2$. The $w$th bit of the previously encoded body flit is indicated by inv, which shows if it was inverted (inv = 1) or left as it was (inv = 0). In the encoding logic, each $T_y$ block takes two adjacent bits of the input flits (e.g., $X_1X_2Y_1Y_2$, $X_2X_3Y_2Y_3$, $X_3X_4Y_3Y_4$, etc.) and sets its output to "1" if any of the transition types of $T_y$ is detected. This means that odd inverting this pair of bits leads to a reduction of the link power dissipation (Table I). The $T_y$ block may be implemented using a simple circuit. The second stage of the encoder, which is a majority voter block, determines if the condition given in (12) is satisfied (a higher number of 1s than 0s at the input of the block). If this condition is satisfied, in the last stage, the inversion is performed on the odd bits.

The decoder circuit simply inverts the received flit when the inversion bit is high.
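A behavioural sketch of the Scheme I encoder and decoder follows. It is written under assumptions made here rather than taken from the figure: the payload occupies list positions 0 to $w-2$, the inversion bit is the last position, odd list indices are the odd lines, and the per-pair comparison stands in for the $T_y$ blocks and the majority voter.

```python
def pair_weight(p0, p1, c0, c1):
    """Coupling weight of one adjacent pair with K1=1, K2=2, K3=K4=0."""
    s0, s1 = p0 != c0, p1 != c1
    if s0 != s1:
        return 1  # Type I
    if s0 and c0 != c1:
        return 2  # Type II
    return 0      # Type III or IV


def flip_odd(bits):
    """Invert the bits at odd indices (assumed odd lines)."""
    return [b ^ (i & 1) for i, b in enumerate(bits)]


def encode(prev_encoded, payload):
    """Return the w-bit flit to put on the link: payload plus inv bit."""
    plain = list(payload) + [0]
    inverted = flip_odd(payload) + [1]
    w = len(plain)
    # T_y stand-in: pairs that become cheaper when the flit is odd inverted.
    ty = sum(
        1 for i in range(w - 1)
        if pair_weight(prev_encoded[i], prev_encoded[i + 1],
                       inverted[i], inverted[i + 1])
        < pair_weight(prev_encoded[i], prev_encoded[i + 1],
                      plain[i], plain[i + 1])
    )
    # Majority vote, i.e. condition (12).
    return inverted if ty > (w - 1) / 2 else plain


def decode(flit):
    """Undo the odd inversion when the inversion bit is high."""
    payload, inv = flit[:-1], flit[-1]
    return flip_odd(payload) if inv else list(payload)


prev = [0, 1, 0, 1, 0, 1, 0, 1]          # previously encoded 8-bit flit
payload = [1, 0, 1, 0, 1, 0, 1]          # 7 payload bits
sent = encode(prev, payload)
print(sent, decode(sent) == payload)
```

The decoder needs no knowledge of the previous flit in this scheme, since the single inversion bit fully identifies the action taken.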

#### **B. Scheme II:**

In the proposed encoding scheme II, we make use of both odd (as discussed previously) and full inversion. The full inversion operation converts Type II transitions to Type IV transitions. The scheme compares the current data with the previous one to decide whether the odd, full, or no inversion of the current data can give rise to the link power reduction.

#### 1) Power Model:

Let us indicate with P, P', and P'' the power dissipated by the link when the flit is transmitted with no inversion, odd inversion, and full inversion, respectively. The odd inversion leads to power reduction when P' < P'' and P' < P. The power P'' is given by

$$P'' \propto T_1 + 2T_4^{**}. \tag{13}$$

Neglecting the self-switching activity, we obtain the condition P' < P'' as [see (7) and (13)]

$$T_2 + T_3 + T_4 + 2T_1^{***} < T_1 + 2T_4^{**}. \tag{14}$$

Therefore, using (9) and (11), we can write

$$2(T_2 - T_4^{**}) < 2T_y - w + 1. \tag{15}$$






Based on (12) and (15), the odd inversion condition is obtained as

$$2(T_2 - T_4^{**}) < 2T_y - w + 1, \quad T_y > \frac{w - 1}{2}. \tag{16}$$

Similarly, the condition for the full inversion is obtained from P'' < P and P'' < P'. The inequality P'' < P is satisfied when [23]

$$T_2 > T_4^{**}. \tag{17}$$

Therefore, using (15) and (17), the full inversion condition is obtained as

$$2(T_2 - T_4^{**}) > 2T_y - w + 1, \quad T_2 > T_4^{**}. \tag{18}$$

When neither (16) nor (18) is satisfied, no inversion is performed.
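The decision rule formed by (16) and (18) can be sketched as follows. Two points are assumptions of this sketch: $T_4^{**}$ is taken to be the Type IV pairs whose two lines hold different values (01 to 01 or 10 to 10), since those are the pairs that full inversion turns into Type II, and lines with odd list index are treated as the odd lines. All names are hypothetical.

```python
WEIGHT = {1: 1, 2: 2, 3: 0, 4: 0}  # K1..K4 as given in the text


def pair_type(p0, p1, c0, c1):
    """Coupling transition type (1..4) of two adjacent lines."""
    s0, s1 = p0 != c0, p1 != c1
    if s0 != s1:
        return 1
    if not s0:
        return 4
    return 3 if c0 == c1 else 2


def scheme2_counts(prev, cur):
    """Return (T_y, T_2, T_4**) for the current flit against the previous one."""
    odd = [b ^ (i & 1) for i, b in enumerate(cur)]
    ty = t2 = t4ss = 0
    for i in range(len(cur) - 1):
        t = pair_type(prev[i], prev[i + 1], cur[i], cur[i + 1])
        t_odd = pair_type(prev[i], prev[i + 1], odd[i], odd[i + 1])
        ty += WEIGHT[t_odd] < WEIGHT[t]                 # cheaper if odd inverted
        t2 += t == 2
        t4ss += (t == 4 and cur[i] != cur[i + 1])       # assumed T_4** definition
    return ty, t2, t4ss


def scheme2_decision(prev, cur):
    """Return 'odd', 'full', or 'none' following (16) and (18)."""
    w = len(cur)
    ty, t2, t4ss = scheme2_counts(prev, cur)
    lhs = 2 * (t2 - t4ss)
    rhs = 2 * ty - w + 1
    if lhs < rhs and 2 * ty > w - 1:
        return "odd"    # condition (16)
    if lhs > rhs and t2 > t4ss:
        return "full"   # condition (18)
    return "none"


print(scheme2_decision([0, 1, 0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0, 1, 0]))
print(scheme2_decision([0] * 8, [0, 1, 0, 1, 0, 1, 0, 1]))
print(scheme2_decision([0] * 8, [0] * 8))
```

The first inequality of (16) and of (18) are strict opposites, so at most one of the two conditions can hold for a given pair of flits.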

### 2) Proposed Encoding Architecture:

The operating principles of this encoder are similar to those of the encoder implementing Scheme I. The proposed encoding architecture, which is based on the odd invert condition of (16) and the full invert condition of (18), is shown in Fig. 2. Here again, the $w$th bit of the previously encoded body flit is indicated by inv, which defines if it was odd or full inverted (inv = 1) or left as it was (inv = 0). In this encoder, in addition to the $T_y$ block of the Scheme I encoder, we have the $T_2$ and $T_4^{**}$ blocks, which determine if the inversion based on the transition types $T_2$ and $T_4^{**}$ should take place for the link power reduction.

The second stage is formed by a set of 1s blocks, which count the number of 1s in their inputs. The output of these blocks has a width of $\log_2 w$ bits. The output of the top 1s block gives the number of transitions for which odd inverting the pair of bits leads to link power reduction. The middle 1s block identifies the number of transitions for which full inverting the pair of bits leads to link power reduction. Finally, the bottom 1s block specifies the number of transitions for which full inverting the pair of bits leads to increased link power. Based on the number of 1s for each transition type, Module A decides if an odd invert or full invert action should be performed for the power reduction.



Fig. 3. Decoder architecture Scheme II. (a) Circuit diagram. (b) Internal view of the decoder block (D).

For this module, if (16) or (18) is satisfied, the corresponding output signal becomes "1." If no invert action should take place, none of the outputs is set to "1." Module A can be implemented using full-adder and comparator blocks. The circuit diagram of the decoder is shown in Fig. 3. The $w$ bits of the incoming (previous) body flit are indicated by $Z_i$ ($R_i$), $i = 0, 1, \ldots, w-1$. The $w$th bit of the body flit is indicated by inv, which shows if it was inverted (inv = 1) or left as it was (inv = 0). For the decoder, we only need the $T_y$ block to determine which action has taken place in the encoder. Based on the outputs of these blocks, the majority voter block checks the validity of the inequality given by (12). If the output is "0" ("1") and inv = 1, it means that half (full) inversion of the bits has been performed. Using this output and the logic gates, the inversion action is determined. If two inversion bits were used, the overhead of the decoder hardware could be substantially reduced.




### C. Scheme III:

In the proposed encoding Scheme III, we add even inversion to Scheme II. The reason is that odd inversion converts some of the Type I ($T_1^{***}$) transitions to Type II transitions. As can be observed from Table II, if the flit is even inverted, the transitions indicated as $T_1^{**}$/$T_1^{***}$ in the table are converted to Type IV/Type III transitions. Therefore, the even inversion may reduce the link power dissipation as well. The scheme compares the current data with the previous one to decide whether odd, even, full, or no inversion of the current data can give rise to the link power reduction.

1) Power Model: Let us indicate with P, P', P'', and P''' the power dissipated by the link when the flit is transmitted with no inversion, odd inversion, full inversion, and even inversion, respectively. Similar to the analysis given for Scheme I, we can approximate the condition P''' < P as

$$T_1 + 2T_2 > T_2 + T_3 + T_4 + 2T_1^*. \tag{19}$$

Defining

$$T_e = T_2 + T_1 - T_1^* \tag{20}$$

we obtain the condition P''' < P as

$$T_e > \frac{w-1}{2}. \tag{21}$$

Similar to the analysis given for Scheme II, we can approximate the condition P''' < P' as

$$T_2 + T_3 + T_4 + 2T_1^* < T_2 + T_3 + T_4 + 2T_1^{***}. \tag{22}$$

Using (9) and (20), we can rewrite (22) as

$$T_e > T_y. \tag{23}$$

Also, we obtain the condition P''' < P'' as [see (13) and (19)]

$$T_2 + T_3 + T_4 + 2T_1^* < T_1 + 2T_4^{**}. \tag{24}$$

Now, define

$$T_r = T_3 + T_4 + T_1^*$$

and

$$T_e = T_2 + T_1 - T_1^*. \tag{25}$$

Assuming a link width of $w$ bits, the total number of transitions between adjacent lines is $w - 1$, and hence

$$T_e + T_r = w - 1. \tag{26}$$

Using (26), we can rewrite (24) as

$$2(T_2 - T_4^{**}) < 2T_e - w + 1. \tag{27}$$

The even inversion leads to power reduction when P''' < P, P''' < P', and P''' < P''. Based on (21), (23), and (27), we obtain

$$T_e > \frac{w-1}{2}, \quad T_e > T_y, \quad 2(T_2 - T_4^{**}) < 2T_e - w + 1. \tag{28}$$

The full inversion leads to power reduction when P'' < P, P'' < P', and P'' < P'''. Therefore, using (18) and (27), the full inversion condition is obtained as

$$2(T_2 - T_4^{**}) > 2T_y - w + 1, \quad T_2 > T_4^{**}, \quad 2(T_2 - T_4^{**}) > 2T_e - w + 1. \tag{29}$$



Fig. 4. Encoder architecture

Similarly, the condition for the odd inversion is obtained from P' < P, P' < P'', and P' < P'''. Based on (16) and (23), the odd inversion condition is satisfied when

$$2(T_2 - T_4^{**}) < 2T_y - w + 1, \quad T_y > \frac{w-1}{2}, \quad T_e < T_y. \tag{30}$$

When none of (28), (29), or (30) is satisfied, no inversion is performed.

2) Proposed Encoding Architecture:




The operating principles of this encoder are similar to those of the encoders implementing Schemes I and II. The proposed encoding architecture, which is based on the even invert condition of (28), the full invert condition of (29), and the odd invert condition of (30), is shown in Fig. 4. The $w$th bit of the previously encoded body flit is indicated by inv, which shows if it was even, odd, or full inverted (inv = 1) or left as it was (inv = 0). The first stage of the encoder determines the transition types, while the second stage is formed by a set of 1s blocks which count the number of ones in their inputs. In the first stage, we have added the $T_e$ blocks, which determine if any of the transition types $T_2$, $T_1^{**}$, and $T_1^{***}$ is detected for each pair of bits of their inputs. For these transition types, the even invert action yields link power reduction. We have four Ones blocks to determine the number of detected transitions for each of the $T_y$, $T_e$, $T_2$, and $T_4^{**}$ blocks. The outputs of the Ones blocks are the inputs of Module C. This module determines if the odd, even, full, or no invert action, corresponding to the outputs "10," "01," "11," or "00," respectively, should be performed. The outputs "01," "11," and "10" show whether (28), (29), and (30), respectively, are satisfied.
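The role of Module C can be sketched behaviourally in Python. The function names, the choice of odd list indices as odd lines, and the reading of $T_4^{**}$ as Type IV pairs whose two lines hold different values are assumptions of this sketch; $T_y$ and $T_e$ are counted as the pairs whose coupling weight drops under odd and even inversion, respectively.

```python
WEIGHT = {1: 1, 2: 2, 3: 0, 4: 0}  # K1..K4 as given in the text


def pair_type(p0, p1, c0, c1):
    """Coupling transition type (1..4) of two adjacent lines."""
    s0, s1 = p0 != c0, p1 != c1
    if s0 != s1:
        return 1
    if not s0:
        return 4
    return 3 if c0 == c1 else 2


def scheme3_counts(prev, cur):
    """Return (T_y, T_e, T_2, T_4**), the inputs of Module C."""
    odd = [b ^ (i & 1) for i, b in enumerate(cur)]          # odd lines flipped
    even = [b ^ ((i & 1) ^ 1) for i, b in enumerate(cur)]   # even lines flipped
    ty = te = t2 = t4ss = 0
    for i in range(len(cur) - 1):
        p0, p1 = prev[i], prev[i + 1]
        t = pair_type(p0, p1, cur[i], cur[i + 1])
        ty += WEIGHT[pair_type(p0, p1, odd[i], odd[i + 1])] < WEIGHT[t]
        te += WEIGHT[pair_type(p0, p1, even[i], even[i + 1])] < WEIGHT[t]
        t2 += t == 2
        t4ss += (t == 4 and cur[i] != cur[i + 1])           # assumed T_4**
    return ty, te, t2, t4ss


def module_c(ty, te, t2, t4ss, w):
    """Return the 2-bit action code: '10' odd, '01' even, '11' full, '00' none."""
    d = 2 * (t2 - t4ss)
    if 2 * te > w - 1 and te > ty and d < 2 * te - w + 1:
        return "01"  # even inversion, condition (28)
    if d > 2 * ty - w + 1 and t2 > t4ss and d > 2 * te - w + 1:
        return "11"  # full inversion, condition (29)
    if d < 2 * ty - w + 1 and 2 * ty > w - 1 and te < ty:
        return "10"  # odd inversion, condition (30)
    return "00"      # no inversion


prev = [0, 1, 0, 1, 0, 1, 0, 1]
cur = [1, 1, 1, 1, 1, 1, 1, 1]       # only the even lines switch
print(module_c(*scheme3_counts(prev, cur), w=len(cur)))
```

The three conditions are mutually exclusive because each pair of them contains one strict inequality in opposite directions, so the order of the checks does not affect the result.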

#### **V. SIMULATION RESULTS:**

The simulation was performed using the ModelSim tool and Xilinx ISE Design Suite 14.2. The proposed schemes are general and transparent with respect to the underlying NoC fabric. The results for Scheme 1 (number of occupied slices, number of 4-input LUTs, and IOBs) are shown in Table III. The results for Schemes 2 and 3 (number of occupied slices, number of 4-input LUTs, and IOBs) are shown in Table IV.

#### A. Using ModelSim





Figure 5. Scheme 1

#### 2) 8x8 bit

Figure 6. Scheme 2

Figure 7. Scheme 3

TABLE III. XILINX RESULTS FOR 4x4 BIT MODIFIED BOOTH MULTIPLIER

| PARAMETER                 | Used | Available | Utilization |
|---------------------------|------|-----------|-------------|
| Number of 4 input LUTs    | 25   | 138240    | 1%          |
| Number of occupied Slices | 9    | 34560     | 1%          |
| Number of bonded IOBs     | 16   | 800       | 2%          |

TABLE IV. XILINX RESULTS FOR 8x8 BIT MODIFIED BOOTH MULTIPLIER

| PARAMETER                 | Used | Available | Utilization |
|---------------------------|------|-----------|-------------|
| Number of 4 input LUTs    | 153  | 63400     | 1%          |
| Number of occupied Slices | 43   | 15850     | 1%          |
| Number of bonded IOBs     | 32   | 210       | 15%         |




#### **CONCLUSION:**

In this paper, we have presented a set of new data encoding schemes aimed at reducing the power dissipated by the links of an NoC. Links are responsible for a significant fraction of the overall power dissipated by the communication system, and their contribution is expected to increase in future technology nodes. Compared with the previous encoding schemes proposed in the literature, the rationale behind the proposed schemes is to minimize not only the self-switching activity but also (and in particular) the coupling switching activity, which is mainly responsible for link power dissipation in the deep submicron technology regime. The proposed encoding schemes are agnostic with respect to the underlying NoC architecture, in the sense that their application does not require any modification in either the routers or the links. An extensive evaluation has been carried out to assess the impact of the encoder and decoder logic in the NI. The encoders implementing the proposed schemes have been assessed in terms of power dissipation and silicon area. The impacts on the performance, power, and energy metrics have been studied using a cycle- and bit-accurate NoC simulator under both synthetic and real traffic scenarios. Overall, the application of the proposed encoding schemes allows savings of up to 51% in power dissipation and 14% in energy consumption without any significant performance degradation and with less than 15% area overhead in the NI.

#### **REFERENCES:**

[1] International Technology Roadmap for Semiconductors. (2011) [Online]. Available: http://www.itrs.net

[2] M. S. Rahaman and M. H. Chowdhury, "Crosstalk avoidance and error-correction coding for coupled RLC interconnects," in Proc. IEEE Int. Symp. Circuits Syst., May 2009, pp. 141–144.

[3] W. Wolf, A. A. Jerraya, and G. Martin, "Multiprocessor system-on-chip MPSoC technology," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 27, no. 10, pp. 1701–1713, Oct. 2008.

[4] L. Benini and G. De Micheli, "Networks on chips: A new SoC paradigm," Computer, vol. 35, no. 1, pp. 70–78, Jan. 2002.

[5] S. E. Lee and N. Bagherzadeh, "A variable frequency link for a power-aware network-on-chip (NoC)," Integr. VLSI J., vol. 42, no. 4, pp. 479–485, Sep. 2009.

[6] D. Yeh, L. S. Peh, S. Borkar, J. Darringer, A. Agarwal, and W. M. Hwu, "Thousand-core chips roundtable," IEEE Design Test Comput., vol. 25, no. 3, pp. 272–278, May–Jun. 2008.

[7] A. Vittal and M. Marek-Sadowska, "Crosstalk reduction for VLSI," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 16, no. 3, pp. 290–298, Mar. 1997.

[8] M. Ghoneima, Y. I. Ismail, M. M. Khellah, J. W. Tschanz, and V. De, "Formal derivation of optimal active shielding for low-power on-chip buses," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 5, pp. 821–836, May 2006.

[9] L. Macchiarulo, E. Macii, and M. Poncino, "Wire placement for crosstalk energy minimization in address buses," in Proc. Design Autom. Test Eur. Conf. Exhibit., Mar. 2002, pp. 158–162.

[10] R. Ayoub and A. Orailoglu, "A unified transformational approach for reductions in fault vulnerability, power, and crosstalk noise and delay on processor buses," in Proc. Design Autom. Conf. Asia South Pacific, vol. 2, Jan. 2005, pp. 729–734.