
Research On Architecture Design Of VLSI Implementation For High Speed Image Compression Coder

Posted on: 2007-09-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Y Xiong
GTID: 1118360242461819
Subject: Control Science and Engineering
Abstract/Summary:
As the resolving power of remote sensors improves and the number of wave bands and the swath width increase, the volume of raw image data acquired by satellite-borne remote sensors grows ever larger. Because the channel capacity of satellite communications is limited, data compression is needed to transmit this data efficiently. JPEG 2000, the latest international standard for still image compression, offers excellent compression performance and many features suited to large data sets, remote sensing images in particular. However, the algorithms recommended by JPEG 2000 have high complexity. To meet the real-time requirement of a data compression system on board a satellite, a hardware implementation of a high-performance compression coder based on the JPEG 2000 framework is both necessary and significant.

This dissertation targets the very large scale integration (VLSI) circuit design of a high-speed remote sensing image compression coder based on the JPEG 2000 framework, and investigates two issues. The first is the parallelism of the discrete wavelet transform (DWT) and its high-speed VLSI architecture design. The second is the parallelism of the embedded block coding with optimal truncation (EBCOT) algorithm and its high-speed VLSI architecture design. All designs are described as Verilog HDL models, synthesized for Altera FPGAs with the Quartus II integrated software tool, and verified by timing simulation.

High-speed, low-power architectures for the discrete wavelet transform are explored first. Boundary data processing for finite-length sequences is simple when the second-generation wavelet based on the lifting scheme is adopted.
An embedded boundary data processing technique and its hardware implementation are proposed. The technique efficiently reduces the number of registers, the size of the on-chip memory, and the number of external memory accesses, which lowers both hardware complexity and power consumption. Although the lifting-based wavelet transform has several advantages over the conventional convolution-based one, its critical path can be longer. A parallel lifting scheme for the wavelet transform is therefore presented, together with a one-dimensional architecture optimized for area and speed. The proposed parallel lifting method significantly shortens the critical path and raises the maximum operating frequency of the system. For the CDF(9,7) DWT specifically, the critical path delay can be reduced to one multiplication delay plus four addition delays without pipelining. The lifting-based wavelet transform also reduces the number of arithmetic units by sharing common resources between the two channel filters, which lowers hardware and computational complexity. An embedded decimation technique is presented for implementing the lifting wavelet transform, in which the prediction and update operations belonging to the same lifting step share one set of arithmetic units, further reducing hardware complexity. Only 3 multipliers and 4 adders are required for the 1-D architecture of the CDF(9,7) DWT. The technique also provides an efficient way to implement a direct two-dimensional wavelet transform that processes 2 data samples per clock cycle, and the on-chip memory required for the 2-D architecture of the CDF(9,7) DWT is only 5.5*N (where N denotes the width of the original image).

The size of the on-chip memory that buffers intermediate data is a key factor in the hardware complexity of the two-dimensional wavelet transform.
A line-based architecture for the direct two-dimensional transform can efficiently reduce the buffer size. An efficient line-based architecture for the direct two-dimensional wavelet transform is proposed by exploiting the parallelism of the four subband transforms in the lifting-based two-dimensional discrete wavelet transform; it achieves a high throughput rate and a good speed-to-cost ratio. In this architecture, 4 data samples are transformed in each clock cycle and the coefficients of all 4 subbands are generated concurrently. Furthermore, an efficient high-speed, low-power architecture for the multi-level two-dimensional transform is introduced by combining the recursive pyramid algorithm with parallel and pipelined techniques; it completes any number of levels of 2-D DWT decomposition of an N*N image in approximately N*N/4 clock cycles. The proposed architecture offers not only a high throughput rate but also high hardware utilization. When run at a lower operating frequency, the same high-speed architecture can serve low-power applications. The architecture design of three-dimensional and higher-dimensional wavelet transforms is discussed as well. Several efficient architectures for these transforms are proposed by extending the direct two-dimensional architecture introduced above. Compared with conventional convolution-based multi-dimensional architectures, the proposed method significantly reduces hardware complexity.

EBCOT is the other core algorithm of JPEG 2000. It adopts fractional bit-plane coding and context-adaptive binary arithmetic coding, both of which are bit-level operations. As a result, the EBCOT module consumes a large share of the processing time in JPEG 2000 and is a serious bottleneck for high-speed JPEG 2000 implementations.
An efficient word-level, sequential coding scheme and a high-speed architecture for bit-plane coding are proposed, in which the multiple coding passes are completed in a single scan and all bit planes are coded concurrently. The proposed high-speed architecture performs context formation for 4 coefficients in each clock cycle. The context-adaptive binary arithmetic coding algorithm used in EBCOT is analyzed in detail, and two improved MQ coder architectures with higher throughput rates are proposed. In summary, the VLSI architecture designs for the core algorithms of JPEG 2000 are explored, and ways to increase the parallel processing capability and throughput of each algorithm module are discussed. The research results obtained can provide theoretical guidance and key technology for developing a high-performance real-time coder for remote sensing data compression.
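
For readers unfamiliar with the lifting scheme that the abstract builds on, the following is a minimal software sketch of a one-level 1-D CDF(9,7) transform using two predict and two update steps. It is not the dissertation's hardware architecture: the function names are hypothetical, the lifting coefficients are the commonly quoted rounded values, and the scaling convention (low-pass divided by K, high-pass multiplied by K) is an assumption. Mirroring a neighbour at each edge inside the lifting step is one simple way to see why boundary handling is straightforward with lifting.

```python
# Commonly quoted CDF(9,7) lifting coefficients (rounded values).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
K = 1.230174105  # scaling constant; the convention used here is an assumption


def _predict(s, d, c):
    """d[i] += c * (s[i] + s[i+1]); s is mirrored at the right edge."""
    n = len(d)
    return [d[i] + c * (s[i] + s[min(i + 1, n - 1)]) for i in range(n)]


def _update(s, d, c):
    """s[i] += c * (d[i-1] + d[i]); d is mirrored at the left edge."""
    return [s[i] + c * (d[max(i - 1, 0)] + d[i]) for i in range(len(s))]


def forward_97(x):
    """One-level forward transform of an even-length sequence."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("input length must be even and at least 2")
    s = [float(v) for v in x[0::2]]  # even samples
    d = [float(v) for v in x[1::2]]  # odd samples
    d = _predict(s, d, ALPHA)
    s = _update(s, d, BETA)
    d = _predict(s, d, GAMMA)
    s = _update(s, d, DELTA)
    low = [v / K for v in s]
    high = [v * K for v in d]
    return low, high


def inverse_97(low, high):
    """Undo forward_97 by running the lifting steps in reverse order."""
    s = [v * K for v in low]
    d = [v / K for v in high]
    s = _update(s, d, -DELTA)
    d = _predict(s, d, -GAMMA)
    s = _update(s, d, -BETA)
    d = _predict(s, d, -ALPHA)
    x = [0.0] * (2 * len(s))
    x[0::2] = s
    x[1::2] = d
    return x
```

Because every lifting step is inverted by the same step with the opposite sign, `inverse_97(*forward_97(x))` recovers `x` up to floating-point rounding. Each predict or update step is one multiplication and two additions per output sample, which is the kind of shared arithmetic the abstract's multiplier and adder counts refer to.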
Keywords/Search Tags: remote sensing image, high speed data compression, VLSI architecture, JPEG 2000, discrete wavelet transform, EBCOT