Research On Optimization Of Advanced Video Coding And Its Extension

Posted on: 2009-10-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Q Shen
Full Text: PDF
GTID: 1118360245499304
Subject: Information and Communication Engineering
Abstract/Summary:
Compared with earlier coding standards such as H.26x and MPEG, H.264 achieves a 50% bit rate reduction at similar video quality. This coding efficiency comes from additional tools such as variable block-size motion estimation and multiple reference frames. However, these tools also lead to extremely high computational cost and limit the use of H.264 in real-time applications, so reducing codec computation is an important problem. The H.264 rate control scheme assumes that frame complexity varies gradually from frame to frame. This assumption may not hold, especially for videos with high motion or frequent scene cuts, and it can introduce large distortion variations across frames that are noticeable to human viewers. Overcoming this limitation is therefore an important aspect of implementing an H.264 encoder. Finally, while H.264 improves coding efficiency by 50%, its complexity also increases heavily, and it is very difficult to gain further efficiency within the traditional hybrid video coding scheme alone. Recently, techniques that consider the characteristics of the human visual system (HVS) have been applied in the coding process to increase efficiency. How to efficiently eliminate the perceptual redundancy in video sequences and how to incorporate perceptual cues into the coding process are thus key issues for future video coding. This thesis focuses on these three aspects.

To achieve maximal encoding speed, we analyze the most time-consuming operations, such as integer transform, motion estimation, mode decision and reference frame selection. Based on the correlation among neighboring blocks, motion estimation information from previously searched frames, and texture information, we propose several fast algorithms as follows. After analyzing the quantization and transform operations, we first propose an enhanced all-zero block detection algorithm.
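To make the idea of all-zero block detection concrete, the following is a minimal sketch of the classic SAD-based sufficient condition for a 4×4 residual block, not the enhanced algorithm proposed in the thesis (whose details the abstract does not give). It assumes the standard H.264 4×4 forward core transform and the usual quantization multiplier table; the function names `quantize_block` and `is_all_zero_by_sad` are illustrative only.

```python
# H.264 4x4 forward core transform matrix.
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

# Quantization multipliers indexed by QP % 6 for the three position classes:
# (both indices even, both indices odd, mixed).
MF = [(13107, 5243, 8066), (11916, 4660, 7490), (10082, 4194, 6554),
      (9362, 3647, 5825), (8192, 3355, 5243), (7282, 2893, 4559)]


def _params(qp, intra):
    """Return (qbits, rounding offset f) for the given QP."""
    qbits = 15 + qp // 6
    f = (1 << qbits) // (3 if intra else 6)
    return qbits, f


def _mf_at(qp, i, j):
    a, b, c = MF[qp % 6]
    if i % 2 == 0 and j % 2 == 0:
        return a
    if i % 2 == 1 and j % 2 == 1:
        return b
    return c


def quantize_block(residual, qp, intra=False):
    """Exact path: transform W = C * X * C^T, then quantize each coefficient."""
    qbits, f = _params(qp, intra)
    tmp = [[sum(C[i][k] * residual[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    w = [[sum(tmp[i][k] * C[j][k] for k in range(4)) for j in range(4)]
         for i in range(4)]
    return [[(1 if w[i][j] >= 0 else -1)
             * ((abs(w[i][j]) * _mf_at(qp, i, j) + f) >> qbits)
             for j in range(4)] for i in range(4)]


def is_all_zero_by_sad(residual, qp, intra=False):
    """Sufficient (not necessary) test that every quantized coefficient is zero.

    |W_ij| <= k * SAD with k = 1, 2 or 4 depending on the position class,
    so the block is all-zero if max(k * MF) * SAD + f < 2^qbits.
    """
    qbits, f = _params(qp, intra)
    sad = sum(abs(v) for row in residual for v in row)
    a, b, c = MF[qp % 6]
    worst = max(a, 2 * c, 4 * b)
    return sad * worst + f < (1 << qbits)
```

When `is_all_zero_by_sad` returns True, the transform and quantization for that block can be skipped; when it returns False, the block may still quantize to zero, which is the gap that more refined detection schemes try to close.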
We also propose an adaptive and fast fractional pixel search algorithm to reduce the computational load of motion estimation. By adopting the proposed bypass strategy and early termination strategy, this fractional search algorithm greatly improves on the Center Biased Fractional Pixel Search method. A new fast intra prediction algorithm is proposed to improve intra mode decision using three criteria: the mode correlation of neighboring MBs, early termination based on all-zero block detection, and prediction of the intra mode size from the texture features of the MB. A novel inter mode decision algorithm, using the SAD of each 4×4 block from the initial test of the inter 16×16 mode together with texture characteristics, is proposed to reduce the candidate mode set. In this thesis, we also propose an adaptive and fast multi-frame selection algorithm that uses the correlation among neighboring blocks and motion estimation information from previously searched reference frames.

Most recent fast algorithms for H.264 exploit only texture information in the spatial domain. They assume that texture-homogeneous regions tend to have similar motion, while regions with complex texture have disordered motion. However, this does not always hold in general video sequences, since a region with complex texture may exhibit homogeneous motion in some cases. In fact, the selected mode and reference frame are strongly correlated with motion characteristics. MBs in regions with relatively static or smooth motion are more likely to be coded using the larger block size (16×16) and the nearest reference frame, while for MBs in regions with complex motion or motion edges, more block sizes and reference frames should be evaluated. Thus, in this thesis, we propose a fast inter mode decision algorithm and a fast multiple-frame selection algorithm that use the spatial and temporal continuity of the motion field.
Experimental results show that the proposed algorithms achieve remarkable computation savings and a consistent gain on all video sequences compared with methods based on texture homogeneity.

To overcome the limitation of frame-level bit allocation, we improve the H.264 rate control scheme using two tools: the incremental proportional-integral-differential (PID) algorithm and frame complexity estimation. The incremental PID algorithm is first introduced to control the buffer and reduce the influence of abrupt buffer fluctuations on frame-level bit allocation. To further reduce video quality variations, the frame target bit allocation is also adjusted according to the frame complexity estimated from the residual energy. Experimental results show that the proposed rate control scheme decreases the average variation of video quality by 32%. To overcome the limitation of GOP-level bit allocation, we propose a novel GOP-level bit allocation based on the incremental PID algorithm. Extensive experimental results show that the proposed scheme, without adding significant computational complexity, decreases the average video quality variation by 15%.

Finally, we analyze the characteristics of the human visual system (HVS) and introduce visual attention into the video coding process. An adaptive quantization method is proposed based on the visual interest effect. The key idea of this method is to use the concept of visual attention and assign fewer bits to regions that are far from the visually fixated location. In this thesis, we also propose a novel rate control algorithm based on visual attention. The bits allocated to each frame are proportional to its local motion attention: more bits are allocated to a frame if its local motion is stronger. Within a frame, more bits are assigned to visual attention MBs and fewer bits to visually less important MBs.
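The attention-driven allocation described above can be illustrated with a small sketch. It is an assumption-laden toy, not the thesis's algorithm: the attention scores are taken as given non-negative numbers, the function names `allocate_frame_bits` and `macroblock_qp` are hypothetical, and the linear QP offset with `max_offset=4` is chosen only for illustration. The one fixed fact used is that H.264 QP values lie in the range 0 to 51.

```python
def allocate_frame_bits(total_bits, motion_attention):
    """Split a bit budget across frames in proportion to their motion attention."""
    total_attention = sum(motion_attention)
    if total_attention <= 0:
        # No attention information: fall back to a uniform split.
        share = total_bits / len(motion_attention)
        return [share] * len(motion_attention)
    return [total_bits * a / total_attention for a in motion_attention]


def macroblock_qp(base_qp, attention, max_offset=4):
    """Lower QP (more bits) for attended MBs, raise it for unattended ones.

    `attention` is assumed to be normalized to [0, 1], with 1 the most attended.
    """
    offset = round(max_offset * (1.0 - 2.0 * attention))
    return max(0, min(51, base_qp + offset))
```

For example, `allocate_frame_bits(1000, [1, 3])` gives the second frame three times the bits of the first, and `macroblock_qp(30, 1.0)` lowers the QP of a fully attended MB relative to the base value.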
Experimental results show that our method improves the visual quality in frames with strong local activity and reduces temporal PSNR fluctuations across frames by up to 12%. Meanwhile, the subjective quality is improved in terms of the PSNR increase in the visual attention region, which is as large as 1.45 dB compared with the traditional H.264 rate control scheme.
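For the rate control part of the abstract, the incremental form of a PID controller can be sketched as follows. This shows only the generic textbook update, Δu(k) = Kp·[e(k) − e(k−1)] + Ki·e(k) + Kd·[e(k) − 2e(k−1) + e(k−2)]; how the thesis maps buffer fullness to the error signal and which gains it uses are not stated in the abstract, so the class name `IncrementalPID` and the gain values in the usage note are assumptions.

```python
class IncrementalPID:
    """Incremental (velocity-form) PID: returns the change in control output."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.e1 = 0.0  # error at step k-1
        self.e2 = 0.0  # error at step k-2

    def step(self, error):
        delta = (self.kp * (error - self.e1)
                 + self.ki * error
                 + self.kd * (error - 2.0 * self.e1 + self.e2))
        # Shift the error history for the next call.
        self.e2, self.e1 = self.e1, error
        return delta
```

In a buffer-driven rate controller, one plausible use is to feed `step` the difference between a target buffer level and the current buffer fullness and add the returned increment to the frame's target bits, for example with `pid = IncrementalPID(1.0, 0.5, 0.25)`. Because only increments are produced, a sudden buffer jump changes the target gradually rather than resetting it.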
Keywords/Search Tags: Video coding, H.264, Algorithm optimization, Motion analysis, Rate control, Visual attention, Human Visual System