Research On H.264 Region-of-Interest Coding Based On Visual Perception

Posted on: 2009-04-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Zheng
Full Text: PDF
GTID: 1118360275982695
Subject: Electronic information technology and instrumentation
Abstract/Summary:
Video coding, one of the key technologies for the efficient transmission and storage of video information, plays an important part in modern information technology. H.264/AVC (H.264 for short) is the newest video coding standard jointly recommended by ITU and ISO/IEC. Throughout the history of video coding, achieving optimal rate-distortion performance under constraints on complexity and allowed delay has remained the core problem of codec design. In the past, rate-distortion performance was improved mainly by reducing spatial, temporal and statistical redundancy; today, region-based video coding that exploits visual processing has become a major research direction. The perception of a video scene by the human visual system (HVS) is selective, and different regions or objects in the scene have different levels of visual importance. Conventional video encoding algorithms, however, ignore this diversity in the perception mechanism. It is therefore of both theoretical and practical value to study in depth how the compression and computation efficiency of the H.264 encoding algorithm can be improved by applying the principles of HVS visual perception.

Chapter 1 presents the significance of the research together with a brief summary of the current state of research.

Chapter 2 proposes a fast global motion estimation (GME) method, based on the principles of symmetry elimination and difference of motion vectors, to reduce the computational complexity of GME. The proposed method consists of two stages. First, the translational parameters are obtained using symmetry elimination of motion vectors. Then, the transform parameters are estimated using the difference of motion vectors and a belief judgment strategy. The resulting effective and efficient estimation of global motion parameters lays a foundation for the subsequent research.

Chapter 3 presents a novel moving region detection method in the H.264 compressed domain, in which side encoding information, namely motion vectors (MV) and sums of absolute differences (SAD), is used as the input features. The proposed detection method is composed of three processing steps. In the first step, global motion estimation/compensation and spatio-temporal filtering of the MVs are used to detect moving regions with salient motion. In the second step, the χ² distribution of the SAD information at zero MV is constructed, and a change detection algorithm derived from the F hypothesis test is applied to detect moving regions with both salient and non-salient motion. In the final step, the results of the two preceding steps are combined to compute the final moving region map.

Chapter 4 proposes a novel visual perception model that fuses spatio-temporal visual features and is composed of motion perception, texture perception and spatial position perception sub-models. First, to simulate the HVS's perception of moving regions, motion perception is modeled by fusing motion visual features including motion velocity, motion direction, motion coherence and biological motion. Then, to simulate the HVS's perception of texture complexity, texture perception is modeled on the mechanisms of visual sensitivity and visual masking. Finally, spatial position perception is modeled on the mechanisms of the fovea and eye movement. As a result, the spatial position perception sub-model can adaptively adjust the perceptual importance of different positions in the video scene according to the global motion type.

Chapter 5 puts forward a novel H.264 region-of-interest coding method that allocates bit and computation resources based on visual perception. The visual perception map (VPM) is computed with the proposed visual perception model. To allocate the bit resource effectively through the VPM, an adaptive frequency coefficient suppression technique is first derived from the principle that the HVS is less sensitive to distortion of high-frequency signals. Second, the distribution characteristics of the bit resource are analyzed theoretically and experimentally. Finally, the optimal bit resource allocation is achieved according to a novel encoding strategy. To allocate the computation resource effectively based on the VPM and the global motion type of the video scene, the relation between the optimal encoding mode and the content features of the video scene is analyzed experimentally, from which a fast and effective H.264 mode analysis algorithm is derived.

The final chapter summarizes the contributions of the research and outlines prospects for future work.
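
The frequency coefficient suppression idea summarized for Chapter 5 can be illustrated with a minimal sketch. This is not the dissertation's algorithm, which the abstract does not specify. The function name `suppress_high_frequencies`, the `importance` value in [0, 1] (standing in for one VPM entry) and the linear mapping from importance to a frequency cutoff are all assumptions made here for illustration. The sketch only shows the general principle: blocks judged less important keep fewer high-frequency transform coefficients.

```python
def suppress_high_frequencies(block, importance):
    """Zero the high-frequency coefficients of a square transform block.

    block: N x N list of lists of transform coefficients, DC term at [0][0].
    importance: hypothetical perceptual importance in [0, 1] (1 = most
        important). This stands in for a VPM value; the mapping below is
        an assumption, not the method of the dissertation.
    """
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must lie in [0, 1]")
    n = len(block)
    max_index = 2 * (n - 1)  # largest value of u + v in an n x n block
    # Assumed linear mapping: importance 1 keeps every coefficient,
    # importance 0 keeps only the DC coefficient.
    cutoff = round(importance * max_index)
    return [
        [coeff if (u + v) <= cutoff else 0 for v, coeff in enumerate(row)]
        for u, row in enumerate(block)
    ]


# Usage: a 4 x 4 block of ones makes the surviving coefficients easy to count.
example_block = [[1] * 4 for _ in range(4)]
low_importance = suppress_high_frequencies(example_block, 0.0)
high_importance = suppress_high_frequencies(example_block, 1.0)
```

In a real encoder the cutoff would interact with quantization and rate control; the sketch ignores both and leaves the input block unmodified.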
Keywords/Search Tags: video coding, visual perception, region-of-interest (ROI), H.264, global motion estimation, moving region detection, fast mode analysis, feature fusion