Design And Optimization Of Efficient Architecture For Motion Estimation

Posted on: 2012-04-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Chen
Full Text: PDF
GTID: 1118330335462528
Subject: Circuits and Systems
Abstract/Summary:
Block matching motion estimation, which is widely adopted in mainstream video compression standards, reduces the bit rate of video compression systems by exploiting the temporal redundancy between successive frames. For a practical video encoder in consumer products such as mobile multimedia phones, the encoder should compress video frames within a pre-allocated computational budget, because the computational power of these products is limited. This underlines the need for high-speed, low-power silicon architectures that implement video compression algorithms at different resolutions. The motion estimation computing array (MECA) generally performs up to 50% of the computations in the entire video coding system and is typically considered its most computationally intensive part. Integrating the MECA into a system-on-chip design has therefore become increasingly important for video coding applications.

This doctoral dissertation presents an intensive study of efficient (high-throughput, low-power, low-bandwidth) architectures for motion estimation. The main work and innovations are as follows:

1) An area-efficient full-search block motion estimation engine based on low bit-depth representation is proposed to meet the processing requirements of real-time, low-complexity video compression. The source pixel based linear array (SPBLA) is adopted as the system-level architecture. Optimized structures are presented for the two system bottlenecks, the ROM-based systolic cell and the redundant data memory organization. Implementation results show that, compared with hardware reported in the prior literature, the proposed hardware achieves a significant improvement in area with no loss of throughput.

2) To meet the processing requirements of portable real-time full HD video compression, a novel macroblock-level parallel architecture based on SPBLA is proposed. It overcomes the massive resource usage and large delay of the 2-D arrays used in the literature. The proposed architecture is easy to extend and area-economical. Optimized structures are also presented for the system bottlenecks, the systolic cell and the data memory organization. Implementation results show that, compared with the traditional architecture, the proposed architecture improves speed and area at the same time.

3) A motion estimation engine with a dynamic search range is proposed to overcome the limitation of the fixed search range widely used in previous work. Adjusting the search range reduces not only the off-chip memory bandwidth but also the computational complexity and the power dissipation. A circular distributed storage structure is presented to provide data access to the dynamic search area, and its address logic is simple enough to be implemented with a LUT. Moreover, the time-consuming SAD computing array is divided into a balanced, adder-optimized pipeline. Implementation results show that, compared with traditional engines, the proposed engine achieves significant improvements in hardware and power efficiency at the cost of a minor loss of throughput.

4) A novel motion estimation structure based on improved normalized partial distortion search is proposed to meet three primary requirements of real-time video encoding: low power, low bandwidth and high area utilization efficiency. The ME engine supports both normalized partial distortion search and adaptive search range adjustment. The former reduces the computational complexity of ME to save power and area; the latter avoids unnecessary accesses to external memory to lower the data bandwidth. Implementation results show that, compared with traditional engines, the engine achieves significant improvements in hardware and power efficiency at the cost of a minor loss of throughput.
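
As a point of reference for the engines summarized above, the full-search block matching that they accelerate can be sketched in software. This is only a minimal illustration under stated assumptions, not the dissertation's hardware design: the function names `sad` and `full_search`, the list-of-rows frame layout, and the rule of skipping candidates that fall outside the frame are all choices made here for the example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` whose
    top-left corner is (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Test every displacement within +/- search_range and return
    (best_dx, best_dy, best_sad). Candidates that would read outside the
    reference frame are skipped (an assumption of this sketch)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            # Keep the first candidate with the strictly lowest SAD.
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best
```

With a search range of R, the loop evaluates up to (2R + 1)^2 candidate blocks, each costing n^2 absolute differences. This is why shrinking the search range, as in item 3, lowers both the computation and the amount of reference data that has to be fetched.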
Keywords/Search Tags: motion estimation, variable block size, systolic array, low power, low bandwidth, VLSI