Research On Pixel Matching Computation Acceleration In Video And Image Processing

Posted on: 2012-01-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H T Gu
Full Text: PDF
GTID: 1118330341951742
Subject: Electronic Science and Technology
Abstract/Summary:
Pixel matching is widely used in communications, medical care, education, and the military. Typical pixel matching computations include motion estimation in video coding and correlation matching in object tracking and recognition. With the rapid development of video and image processing applications, image resolution and frame rate keep increasing, and the computational complexity of pixel matching grows sharply, which poses a great challenge to the performance of digital signal processors. Research on accelerating pixel matching computation is therefore significant for improving processor performance and meeting the requirements of real-time video and image processing.

Pixel matching is computation-intensive, data-intensive, and subject to real-time constraints. A hardware accelerator is an efficient way to implement real-time pixel matching computation. This thesis studies motion estimation and correlation matching, the classic pixel matching computations in video and image processing. Several acceleration techniques are explored to improve the flexibility and performance of motion estimation and correlation matching hardware accelerators, optimize fast algorithms for hardware implementation, and increase the transfer efficiency and flexibility of the interface between the hardware accelerator and the processor. A detailed performance analysis and evaluation of each technique is carried out on our DSP platform. The main contributions and innovations of this thesis are as follows:

1) A multiple-search-center motion estimation algorithm suitable for hardware implementation is proposed to speed up the computation. The algorithm is based on multi-search-center prediction and dynamic search range adjustment. The multi-search-center prediction analyzes the motion vectors of spatially and temporally adjacent blocks and predicts multiple motion vectors for the current block. Compared with traditional motion vector prediction, the proposed method improves prediction accuracy by up to 12.9%. The search range is dynamically adjusted according to the count and magnitude of the predicted search centers, which further reduces computational complexity. Compared to the FFS, UMHexagonS, and EPZS algorithms, the proposed algorithm achieves similar rate-distortion performance while reducing computational complexity by about 89.9%-98.4%, 46.5%-67.9%, and 20.0%-46.8%, respectively. Like FFS, the search method of the proposed algorithm is easy to implement in hardware because its computation is regular.

2) A motion estimation coprocessor supporting multiple coding standards is presented. The coprocessor is based on a very long instruction word (VLIW) architecture and can effectively perform various motion estimation algorithms. The hardware architecture employs a two-dimensional data-reuse processing element (PE) array, a sum of absolute differences (SAD) tree structure, and a multi-mode cost comparator. The PE array and the SAD tree efficiently handle the heavy computational load of motion estimation, and the multi-mode cost comparator supports the different block partition modes of various video coding standards. In comparison to other hardware accelerators, the proposed coprocessor not only meets the computation requirements of real-time HD applications but is also flexible enough to support various motion estimation algorithms.

3) A multi-standard configurable interpolation architecture is proposed. It consists of two independent 8-tap filters, which perform horizontal and vertical filtering of a frame, respectively. A parameter memory stores the different filter parameters for various video standards. Using a two-step filtering method, the two filters reduce interpolation computation by about 46%. Compared with previous works, the proposed interpolation architecture obtains higher performance with low area. Running at 250 MHz, the architecture satisfies the interpolation computation requirements of encoding HD video sequences.

4) An efficient SAD accelerator is proposed. A PE array and an adder tree structure are used to improve the execution speed of SAD computation. The pipeline of the PE array and the adder tree is partitioned carefully to increase the operating frequency. The area and power of the accelerator are reduced by optimizing the PE array size and the memory size. Compared to previous works, the proposed SAD accelerator is the most efficient. For a 64x64 real-time input image at 60 fps, the accelerator can match up to 162 template images.

5) A user-defined processor core interface for hardware accelerators is proposed. Based on a user-defined interface description, the protocol wrapper can be generated automatically. The interface supports common bus protocols such as AHB and PVCI. Through the interface, accelerator instructions executed by the processor core directly control wide data transfers between the processor and the accelerator. Compared to the traditional DMA bus, the proposed interface can save up to 83.7% and 87.1% of the transmission time.
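
The SAD cost that underlies both the motion estimation and the correlation matching summarized above can be illustrated with a minimal software sketch. It is not the thesis's implementation: the function names `sad` and `full_search`, the tiny test frame, and the exhaustive scan of every window position are assumptions chosen only to show the arithmetic.

```python
def sad(frame, template, top, left):
    """Sum of absolute differences between the template and the
    frame window whose top-left corner is at (top, left)."""
    return sum(
        abs(frame[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )


def full_search(frame, template):
    """Exhaustively evaluate every window position (hypothetical helper).

    Returns (best_sad, (top, left)); ties keep the first position found.
    """
    t_h, t_w = len(template), len(template[0])
    f_h, f_w = len(frame), len(frame[0])
    best = None
    for top in range(f_h - t_h + 1):
        for left in range(f_w - t_w + 1):
            cost = sad(frame, template, top, left)
            if best is None or cost < best[0]:
                best = (cost, (top, left))
    return best


# Illustrative data only: a 4x5 frame containing the 2x2 template at (1, 1).
frame = [
    [10, 10, 10, 10, 10],
    [10, 50, 60, 10, 10],
    [10, 70, 80, 10, 10],
    [10, 10, 10, 10, 10],
]
template = [
    [50, 60],
    [70, 80],
]

best_sad, best_pos = full_search(frame, template)
print(best_sad, best_pos)  # prints: 0 (1, 1)
```

This sketch evaluates the differences one pixel at a time. The accelerators described in the abstract instead use a PE array and an adder tree, so the sequential loops here only stand in for work that the hardware is designed to perform in parallel.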
Keywords/Search Tags: Pixel Matching, Hardware Accelerator, Motion Estimation, Sub-Pixel Interpolation, Correlation Matching, Automatic Interface Generation