

# ISSN No: 2348-4845 International Journal & Magazine of Engineering, Technology, Management and Research

A Peer Reviewed Open Access International Journal

# VLSI Architecture Design for Image Compression Using Modified DCT

# E.Divya Teja

PG Scholar, Dept of ECE, Global College of Engineering & Technology, Kadapa, YSR (Dt), AP, India.

# **ABSTRACT:**

The discrete cosine transform (DCT) is a widely used transform engine for image and video compression applications. In recent years, visual media has progressed towards high-resolution specifications such as high-definition television (HDTV).

The proposed spatial scheduling strategy includes the ability to choose the precision bit length of the proposed methodology (i.e., 8 bits) and a hardware-sharing architecture that reduces the hardware cost. The proposed time scheduling strategy arranges the computations of different dimensions so that the first-dimensional and second-dimensional transformations are calculated simultaneously, with a hardware utilization of 100%.

A multiplierless methodology, in which the multipliers are replaced with adders, is chosen based on test image simulations. In addition, the proposed hardware-sharing architecture employs a binary signed-digit architecture that enables the arithmetic resources to be shared across all blocks of the image.

Tags: DCT transform, image compression.

# **I. INTRODUCTION:**

The growth of multimedia technology over the past decades has demanded increased use of digital information, and advances in technology have made digital images prevalent. Digital images comprise large amounts of data. Reducing the size of image data for both storage and transmission is becoming increasingly important as digital images find more applications. Image compression maps from a high-dimensional space to a low-dimensional space.

# Y.Maheswar Reddy

Associate Professor, Dept of ECE, Global College of Engineering & Technology, Kadapa, YSR (Dt), AP, India.

The main aim of image compression is to represent an image with the minimum number of bits at an acceptable image quality. Transformation is a very useful tool in image compression: it transforms the image data from the spatial domain to the frequency domain, where the spatial redundancy of the original data can be minimized.

The energy of the transformed data is concentrated mainly in the low-frequency region. An image can therefore be represented by a few transform coefficients, and most of the remaining coefficients can be discarded without significantly affecting the reconstructed image quality.
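The energy-compaction claim can be illustrated with a small numerical sketch. The code below is not from the paper: it assumes an orthonormal 8-point DCT computed directly from the cosine definition, applies it along rows and then columns, and uses a hypothetical smooth 8 x 8 block (`smooth_block`) to measure how much of the total energy sits in the four lowest-frequency coefficients.

```python
import math

N = 8

def dct_1d(v):
    """Orthonormal 8-point DCT of a length-8 sequence (direct definition)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                               for n in range(N)))
    return out

def dct_2d(block):
    """Row-column 2D DCT: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(N)]) for j in range(N)]
    # cols[j][k] holds coefficient (k, j); transpose back to [k][j] order.
    return [[cols[j][k] for j in range(N)] for k in range(N)]

def low_freq_energy_fraction(block, keep):
    """Fraction of total energy held by the top-left keep x keep coefficients."""
    coeffs = dct_2d(block)
    total = sum(c * c for row in coeffs for c in row)
    kept = sum(coeffs[u][v] ** 2 for u in range(keep) for v in range(keep))
    return kept / total

# Hypothetical smooth block (a gentle brightness ramp), not taken from the paper.
smooth_block = [[100 + 5 * i + 3 * j for j in range(N)] for i in range(N)]
fraction = low_freq_energy_fraction(smooth_block, keep=2)
print(f"energy held by 4 of 64 coefficients: {fraction:.5f}")
```

For smooth content like this ramp, nearly all of the energy lands in the top-left corner, which is why the high-frequency coefficients can be dropped with little visible loss. Blocks with sharp edges or texture would compact less well.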

# II. RELATED WORK: 2D DCT Algorithm:

The 2D DCT architecture uses the row-column distributed arithmetic version of the Chen fast DCT algorithm [2]. The first step of the Chen algorithm is a factorization of the DCT matrix such that the computation of the even-indexed coefficients is fully separated from the computation of the odd-indexed coefficients. The 1D DCT coefficients $X_k$, $k = 0, 1, \ldots, 7$, for an 8-point input vector $x_n$, $n = 0, 1, \ldots, 7$, can be expressed as follows:

$$\begin{bmatrix} X_0 \\ X_2 \\ X_4 \\ X_6 \end{bmatrix} = \begin{bmatrix} A & A & A & A \\ B & C & -C & -B \\ A & -A & -A & A \\ C & -B & B & -C \end{bmatrix} \begin{bmatrix} x_0 + x_7 \\ x_1 + x_6 \\ x_2 + x_5 \\ x_3 + x_4 \end{bmatrix}$$
(1)  
$$\begin{bmatrix} X_1 \\ X_3 \\ X_5 \\ X_7 \end{bmatrix} = \begin{bmatrix} D & E & F & G \\ E & -G & -D & -F \\ F & -D & G & E \\ G & -F & E & -D \end{bmatrix} \begin{bmatrix} x_0 - x_7 \\ x_1 - x_6 \\ x_2 - x_5 \\ x_3 - x_4 \end{bmatrix}$$
(2)

October 2015 Page 282




where $A = \cos(\frac{\pi}{4})$, $B = \cos(\frac{\pi}{8})$, $C = \sin(\frac{\pi}{8})$, $D = \cos(\frac{\pi}{16})$, $E = \cos(\frac{3\pi}{16})$, $F = \sin(\frac{3\pi}{16})$ and $G = \sin(\frac{\pi}{16})$.
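The even/odd split of equations (1) and (2) can be checked numerically. The following is a minimal sketch, not the authors' implementation. It uses the constants A to G as defined above and compares the result with unnormalised cosine sums, in which the k = 0 term is scaled by cos(π/4). The last row of the even matrix is taken as [C, -B, B, -C], which is what the cosine definition gives for $X_6$. The function names are illustrative.

```python
import math

A = math.cos(math.pi / 4)
B = math.cos(math.pi / 8)
C = math.sin(math.pi / 8)
D = math.cos(math.pi / 16)
E = math.cos(3 * math.pi / 16)
F = math.sin(3 * math.pi / 16)
G = math.sin(math.pi / 16)

EVEN = [[A, A, A, A],
        [B, C, -C, -B],
        [A, -A, -A, A],
        [C, -B, B, -C]]
ODD = [[D, E, F, G],
       [E, -G, -D, -F],
       [F, -D, G, E],
       [G, -F, E, -D]]

def matvec(m, v):
    """Multiply a 4x4 matrix by a length-4 vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def dct8_even_odd(x):
    """8-point DCT via the even/odd split of equations (1) and (2)."""
    sums = [x[n] + x[7 - n] for n in range(4)]
    diffs = [x[n] - x[7 - n] for n in range(4)]
    out = [0.0] * 8
    out[0::2] = matvec(EVEN, sums)   # X0, X2, X4, X6
    out[1::2] = matvec(ODD, diffs)   # X1, X3, X5, X7
    return out

def dct8_direct(x):
    """Reference: plain cosine sums, with X0 scaled by cos(pi/4)."""
    out = []
    for k in range(8):
        s = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / 16) for n in range(8))
        out.append(A * s if k == 0 else s)
    return out

x = [105, 161, 280, 130, 510, 54, 415, 7]   # hypothetical input samples
fast = dct8_even_odd(x)
ref = dct8_direct(x)
print(max(abs(p - q) for p, q in zip(fast, ref)))
```

The split replaces one 8 x 8 matrix-vector product (64 multiplications) with two 4 x 4 products (32 multiplications) plus eight additions or subtractions.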

### **Distributed Arithmetic:**

Distributed arithmetic (DA) is an important FPGA technology. It is extensively used in computing sums of products of the form

$$y = \langle \mathbf{c}, \mathbf{x} \rangle = \sum_{n=0}^{N-1} c[n] \times x[n].$$
(3)

To understand the DA design paradigm, consider the "sum of products" inner product shown below:

$$y = \langle \mathbf{c}, \mathbf{x} \rangle = \sum_{n=0}^{N-1} c[n] \times x[n] = c[0]x[0] + c[1]x[1] + \ldots + c[N-1]x[N-1].$$
(4)

Assume further that the coefficients c[n] are known constants and x[n] is a variable. An unsigned DA system assumes that the variable x[n] is represented by

$$x[n] = \sum_{b=0}^{B-1} x_b[n] \times 2^b \quad \text{with } x_b[n] \in \{0, 1\}$$
(5)

where $x_b[n]$ denotes the bth bit of x[n], i.e., the nth sample of x. The inner product y can, therefore, be represented as

$$y = \sum_{n=0}^{N-1} c[n] \times \sum_{b=0}^{B-1} x_b[n] \times 2^b.$$
(6)

Redistributing the order of summation (thus the name "distributed arithmetic") results in

$$y = \sum_{b=0}^{B-1} 2^b \times \sum_{n=0}^{N-1} c[n] \times x_b[n].$$
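Swapping the two sums in equation (6) means that, for each bit position b, the inner sum depends only on one bit from every input, so it can be read from a precomputed table. The sketch below illustrates that idea for unsigned inputs. It is a software model under stated assumptions (integer coefficients standing in for fixed-point constants, a table of 2^N entries, hypothetical function names), not a description of the hardware in this paper.

```python
def build_lut(coeffs):
    """Precompute the sum of c[n] for every subset of inputs (2**N entries)."""
    n = len(coeffs)
    lut = []
    for address in range(1 << n):
        lut.append(sum(coeffs[i] for i in range(n) if (address >> i) & 1))
    return lut

def da_inner_product(coeffs, xs, bits):
    """Unsigned distributed-arithmetic inner product, one bit plane per step."""
    lut = build_lut(coeffs)
    acc = 0
    for b in range(bits):
        # Address = bit b of every input x[n], packed into one word.
        address = 0
        for i, x in enumerate(xs):
            address |= ((x >> b) & 1) << i
        acc += lut[address] << b   # weight the table output by 2**b
    return acc

coeffs = [3, -5, 7, 2]      # hypothetical fixed-point coefficients
xs = [10, 255, 0, 77]       # hypothetical unsigned 8-bit inputs
y = da_inner_product(coeffs, xs, bits=8)
print(y)
```

No multiplier appears in the loop: each step is a table lookup, a shift, and an addition. A bit-serial model like this takes B steps for B-bit inputs.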

### **III. IMPLEMENTATION**

Computation of the DCT

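The 8 x 8 DCT coefficient matrix can be written as

$$C = \begin{bmatrix} a & a & a & a & a & a & a & a \\ b & d & e & g & -g & -e & -d & -b \\ c & f & -f & -c & -c & -f & f & c \\ d & -g & -b & -e & e & b & g & -d \\ a & -a & -a & a & a & -a & -a & a \\ e & -b & g & d & -d & -g & b & -e \\ f & -c & c & -f & -f & c & -c & f \\ g & -e & d & -b & b & -d & e & -g \end{bmatrix}$$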

Even rows of C are even-symmetric and odd rows are odd-symmetric. Therefore, by exploiting this symmetry in the rows of C and separating the even and odd rows, we get the 1D-DCT as follows:

$$\begin{bmatrix} y_{0} \\ y_{2} \\ y_{4} \\ y_{6} \end{bmatrix} = \begin{bmatrix} a & a & a & a \\ c & f & -f & -c \\ a & -a & -a & a \\ f & -c & c & -f \end{bmatrix} \begin{bmatrix} x_{0} + x_{7} \\ x_{1} + x_{6} \\ x_{2} + x_{5} \\ x_{3} + x_{4} \end{bmatrix}, \quad \begin{bmatrix} y_{1} \\ y_{3} \\ y_{5} \\ y_{7} \end{bmatrix} = \begin{bmatrix} b & d & e & g \\ d & -g & -b & -e \\ e & -b & g & d \\ g & -e & d & -b \end{bmatrix} \begin{bmatrix} x_{0} - x_{7} \\ x_{1} - x_{6} \\ x_{2} - x_{5} \\ x_{3} - x_{4} \end{bmatrix}$$
(8)

The inverse transform (1D-IDCT) has the same even/odd structure. With x here denoting the transform coefficients and y the reconstructed samples, it is written as follows:

$$\begin{bmatrix} y_{0} \\ y_{1} \\ y_{2} \\ y_{3} \end{bmatrix} = \begin{bmatrix} a & c & a & f \\ a & f & -a & -c \\ a & -f & -a & c \\ a & -c & a & -f \end{bmatrix} \begin{bmatrix} x_{0} \\ x_{2} \\ x_{4} \\ x_{6} \end{bmatrix} + \begin{bmatrix} b & d & e & g \\ d & -g & -b & -e \\ e & -b & g & d \\ g & -e & d & -b \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{3} \\ x_{5} \\ x_{7} \end{bmatrix}$$
$$\begin{bmatrix} y_{7} \\ y_{6} \\ y_{5} \\ y_{4} \end{bmatrix} = \begin{bmatrix} a & c & a & f \\ a & f & -a & -c \\ a & -f & -a & c \\ a & -c & a & -f \end{bmatrix} \begin{bmatrix} x_{0} \\ x_{2} \\ x_{4} \\ x_{6} \end{bmatrix} - \begin{bmatrix} b & d & e & g \\ d & -g & -b & -e \\ e & -b & g & d \\ g & -e & d & -b \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{3} \\ x_{5} \\ x_{7} \end{bmatrix}$$
(9)
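Equations (8) and (9) can be checked against each other numerically. The sketch below reads equation (9) as the inverse of equation (8), with the roles of the two vectors swapped. It also assumes that a to g are the half-cosines cos(kπ/16)/2, since the text does not list their values. Under those assumptions, a forward pass followed by an inverse pass should return the original samples.

```python
import math

# Assumed coefficient values (not stated in the text): half-cosines,
# which make the 8-point transform orthonormal.
a = 0.5 * math.cos(math.pi / 4)
b = 0.5 * math.cos(math.pi / 16)
c = 0.5 * math.cos(math.pi / 8)
d = 0.5 * math.cos(3 * math.pi / 16)
e = 0.5 * math.cos(5 * math.pi / 16)
f = 0.5 * math.cos(3 * math.pi / 8)
g = 0.5 * math.cos(7 * math.pi / 16)

EVEN = [[a, a, a, a], [c, f, -f, -c], [a, -a, -a, a], [f, -c, c, -f]]
ODD = [[b, d, e, g], [d, -g, -b, -e], [e, -b, g, d], [g, -e, d, -b]]
EVEN_T = [list(col) for col in zip(*EVEN)]   # first matrix of equation (9)

def matvec(m, v):
    """Multiply a 4x4 matrix by a length-4 vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def forward(x):
    """Equation (8): even outputs from sums, odd outputs from differences."""
    sums = [x[n] + x[7 - n] for n in range(4)]
    diffs = [x[n] - x[7 - n] for n in range(4)]
    y = [0.0] * 8
    y[0::2] = matvec(EVEN, sums)
    y[1::2] = matvec(ODD, diffs)
    return y

def inverse(y):
    """Equation (9) read as the inverse: transform the even-indexed and
    odd-indexed coefficients separately, then add and subtract."""
    p = matvec(EVEN_T, y[0::2])
    q = matvec(ODD, y[1::2])
    x = [0.0] * 8
    for n in range(4):
        x[n] = p[n] + q[n]
        x[7 - n] = p[n] - q[n]
    return x

samples = [12.0, -3.0, 45.0, 7.0, 0.0, 19.0, -8.0, 30.0]  # hypothetical data
restored = inverse(forward(samples))
print(max(abs(r - s) for r, s in zip(restored, samples)))
```

Both directions use only 4 x 4 matrix-vector products, which is what lets the forward and inverse paths share the same arithmetic structure.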



| Weight ($Z_0$) | Value ($Z_0$) | Weight ($Z_4$) | Value ($Z_4$) |
|----------|-------------|----------|-------------|
| $-2^{0}$ | 0           | $-2^{0}$ | $A_1$       |
| $2^{-1}$ | $A_0 + A_1$ | $2^{-1}$ | $A_0$       |
| $2^{-2}$ | 0           | $2^{-2}$ | $A_1$       |
| $2^{-3}$ | $A_0 + A_1$ | $2^{-3}$ | $A_0$       |
| $2^{-4}$ | $A_0 + A_1$ | $2^{-4}$ | $A_0$       |
| $2^{-5}$ | 0           | $2^{-5}$ | $A_1$       |
| $2^{-6}$ | $A_0 + A_1$ | $2^{-6}$ | $A_0$       |
| $2^{-7}$ | 0           | $2^{-7}$ | $A_1$       |
| $2^{-8}$ | $A_0 + A_1$ | $2^{-8}$ | $A_0 + A_1$ |

#### Table 5.1
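The rows of Table 5.1 can be read as a shift-and-add recipe: each weight $2^{-k}$ is a right shift by k bits, and the value column says which operand is added at that weight. The sketch below evaluates the table with exact fractions. The inputs 100 and 28 are hypothetical, and the comparison with cos(π/4) is an interpretation of what the 8-bit pattern approximates rather than something the table states.

```python
from fractions import Fraction
import math

# Bit positions k (weight 2**-k) at which Table 5.1 lists a nonzero value.
Z0_SUM_BITS = [1, 3, 4, 6, 8]   # rows holding A0 + A1 in the Z0 column
Z4_A0_BITS = [1, 3, 4, 6, 8]    # rows holding A0 in the Z4 column
Z4_A1_BITS = [2, 5, 7, 8]       # rows holding A1 at positive weights (Z4)

def weight(bits):
    """Sum of 2**-k over the listed bit positions, as an exact fraction."""
    return sum(Fraction(1, 2 ** k) for k in bits)

def z0(a0, a1):
    """Z0: shifted copies of (A0 + A1) added together."""
    return (a0 + a1) * weight(Z0_SUM_BITS)

def z4(a0, a1):
    """Z4: the -2**0 row contributes -A1; the other rows add shifted copies."""
    return -a1 + a0 * weight(Z4_A0_BITS) + a1 * weight(Z4_A1_BITS)

a0, a1 = 100, 28                 # hypothetical input data
print(float(z0(a0, a1)), float(z4(a0, a1)))
print(float(weight(Z0_SUM_BITS)), math.cos(math.pi / 4))
```

Evaluated this way, both columns reduce to the same constant 181/256 applied to $A_0 + A_1$ and $A_0 - A_1$ respectively, so the single adder for $A_0 + A_1$ and the shifted additions replace two multiplications.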

Volume No: 2 (2015), Issue No: 10 (October) www.ijmetmr.com



Given the input data $A_0$ and $A_1$, the transform output Zee needs only one adder to compute $(A_0 + A_1)$ and two separate ECATs to obtain the results $Z_0$ and $Z_4$. Similarly, the other transform outputs Zeo and Zo can be implemented in DA-based forms using 10 (= 1 + 9) adders and the corresponding ECATs. Consequently, the proposed 1-D 8-point DCT architecture can be constructed as illustrated in Fig. 3 using a DA-Butterfly-Matrix, which includes two DA even processing elements (DAEs), a DA odd processing element (DAO) and 12 adders/subtractors, and 8 ECATs (one ECAT for each transform output $Z_n$). The eight separate ECATs work simultaneously, enabling high-speed applications. Once the data output from the DA-Butterfly-Matrix is available, the transform output Z is completed within one clock cycle by the proposed ECATs. In contrast, the traditional shift-and-add architecture requires Q clock cycles to complete the transform output Z if the DA precision is Q bits.

**IV. RESULTS:**

Fig. 1: Simulation results of the distributed arithmetic DCT (waveform over a simulation time of 1000 ns, showing the 12-bit outputs Z0 to Z7, the CLK and RST signals, and the 9-bit inputs).

The program modules of the DA-based DCT core with an error-compensated adder tree (ECAT) were taken to the Xilinx tool, where the syntax check, synthesis, and simulation completed successfully. In the test bench written for the main module, the clock and reset pins are initially set to '0' and all inputs are '0'. After 10 ns the reset pin is set to '1' until 40 ns, and the desired 9-bit input values are applied in hexadecimal, producing the 12-bit output values. After 40 ns the reset pin is set to '0'. Input values are applied in the same way at 100 ns and 200 ns, and the corresponding output values are obtained.
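When reading such a simulation, the hexadecimal bus words have to be converted to numbers before they can be compared with an expected DCT result. The helper below is a small sketch that assumes the 9-bit inputs and 12-bit outputs are two's-complement values. The text does not state the number format, so this is an assumption, and the two example words are purely illustrative.

```python
def to_signed(word, width):
    """Interpret an unsigned bus value as a two's-complement integer."""
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    sign_bit = 1 << (width - 1)
    return word - (1 << width) if word & sign_bit else word

# Illustrative words in the style of the test bench (9-bit in, 12-bit out).
x_in = to_signed(0x105, 9)
z_out = to_signed(0xFBE, 12)
print(x_in, z_out)
```

If the design instead uses unsigned or offset-binary data, the conversion would differ, so the format should be confirmed against the Verilog source before interpreting the waveform.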

# **Input Image**



# **Output resultant**



| Image     | Conventional (kb) | Proposed (kb) |
|-----------|-------------------|---------------|
| Input.jpg | 10                | 2             |
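Taking the two sizes in the table at face value and in the same unit, the compression ratio of the proposed method relative to the conventional one, and the corresponding size reduction, work out as:

$$\text{ratio} = \frac{10}{2} = 5, \qquad \text{reduction} = 1 - \frac{2}{10} = 0.8 = 80\%$$

This figure is for the single test image listed.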

# V. CONCLUSION:

The proposed Discrete Cosine Transform (DCT) design was implemented successfully, with the coding done in Verilog HDL. The RTL simulations were performed using Xilinx, and the synthesis was done using Xilinx ISE 12.3i. The DA DCT design was verified for all test cases, and the DCT works properly for all the test values.

### **REFERENCES:**

1. Y. Wang, J. Ostermann, and Y. Zhang, Video Processing and Communications, 1st ed. Englewood Cliffs, NJ: Prentice-Hall, 2002.

2. Y. Chang and C. Wang, "New systolic array implementation of the 2-D discrete cosine transform and its inverse," IEEE Trans. Circuits Syst. Video Technol., vol. 5, no. 2, pp. 150–157, Apr. 1995.





3. C. T. Lin, Y. C. Yu, and L. D. Van, "Cost-effective triple-mode reconfigurable pipeline FFT/IFFT/2-D DCT processor," IEEE Trans. Very Large Scale Integr. Syst., vol. 16, no. 8, pp. 1058–1071, Aug. 2008.

4. S. Uramoto, Y. Inoue, A. Takabatake, J. Takeda, Y. Yamashita, H. Yerane, and M. Yoshimoto, "A 100-MHz 2-D discrete cosine transform core processor," IEEE J. Solid-State Circuits, vol. 27, no. 4, pp. 492–499, Apr. 1992.

### **Author's Profile:**



Eskala Divya Teja, Pursuing M.Tech (VLSI), Global College of Engineering & Technology, Kadapa, Andhra Pradesh, India.



Y.Maheswar, M. Tech, (Ph.D), Associate Prof., Department of ECE, Global College of Engineering & Technology, Kadapa, Andhra Pradesh, India