
Research On Key Technology For H.264/AVC Video Coding

Posted on: 2010-04-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Y Ding
Full Text: PDF
GTID: 1118360272996784
Subject: Communication and Information System
Abstract/Summary:
H.264/AVC is the newest international video coding standard, jointly developed by the ITU-T VCEG and ISO/IEC MPEG committees. It is used in both broadcasting and telecommunications and covers a wide variety of applications, from low-bit-rate communication to high-definition television. The standard is widely used because of its good compression performance. It represents the state of the art in video compression and adopts many new features, including an integer discrete cosine transform, enhanced intra prediction, variable-block-size motion estimation, multiple-reference-frame prediction, an adaptive in-loop deblocking filter, and context-based entropy coding. Compared with MPEG-2 and H.263, its compression efficiency is more than doubled at the same reconstructed picture quality, but its complexity is several times higher. The encoder needs more time and system resources, which limits the use of H.264/AVC in real-time video coding. How to preserve coding efficiency while reducing computational complexity has therefore become a research focus in the industry.

In this dissertation, the modules and functions of JM10.2, provided by ITU-T, are profiled with the Intel VTune Performance Analyzer. The experimental results show that the main time-consuming modules in the encoder are intra prediction mode selection, inter prediction mode selection, multiple reference frame selection, and motion estimation. For each of these modules, theoretical analysis and experimental validation are carried out and corresponding fast algorithms are studied.

H.264 supports 13 luma intra prediction modes and 4 chroma intra prediction modes. To find the best intra prediction mode among all coding modes, the encoder must therefore perform 592 cost computations, which increases complexity drastically. A fast intra mode selection algorithm, FIMDA, is proposed.
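
The abstract does not show where the figure of 592 comes from. A minimal sketch of the commonly cited breakdown, assuming 9 Intra4×4 and 4 Intra16×16 luma modes (13 in total), sixteen 4×4 luma blocks per macroblock, and a full luma search repeated for each of the 4 chroma modes. The constant and function names are illustrative and are not JM10.2 identifiers:

```python
# Illustrative constants; the names are assumptions, not JM10.2 identifiers.
INTRA4X4_MODES = 9        # luma modes per 4x4 block
INTRA16X16_MODES = 4      # luma modes per 16x16 macroblock
CHROMA_MODES = 4          # chroma intra modes per macroblock
BLOCKS_4X4_PER_MB = 16    # a 16x16 macroblock holds sixteen 4x4 luma blocks


def intra_cost_computations(m4=INTRA4X4_MODES, m16=INTRA16X16_MODES,
                            m8=CHROMA_MODES, blocks=BLOCKS_4X4_PER_MB):
    """Count cost evaluations for an exhaustive intra mode search of one macroblock."""
    luma_evaluations = blocks * m4 + m16   # 16 * 9 + 4 = 148
    return m8 * luma_evaluations           # repeated for every chroma mode


print(INTRA4X4_MODES + INTRA16X16_MODES)   # 13 luma modes
print(intra_cost_computations())           # 592
```

Any fast intra algorithm reduces this count by pruning one of the three factors: the block size decision removes a whole luma branch, and candidate-mode selection shrinks the per-block mode count.
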
FIMDA has three aspects:
① Breaking through the spatial limit of intra prediction, the algorithm introduces information from the correlated block in the previous frame. Combined with MAD information, this decides whether the intra prediction size is 4×4 or 16×16.
② For Intra4×4 prediction, a new similarity criterion is introduced; 4×4 blocks that meet a certain condition skip the prediction process.
③ For 4×4 blocks that do not meet the condition, the candidate prediction modes are decided from SATD information and the similarity of prediction modes in adjacent directions.

FIMDA reduces encoding time by at least 60% for the all-intra encoding structure and by about 27% for the IPPP encoding structure, while SNRY remains almost unchanged.

Inter prediction is an important method for removing temporal redundancy. H.264 has 7 inter partition modes and also supports SKIP mode and the intra prediction modes, so the encoder must perform 760 cost computations to find the best inter prediction mode. This computational complexity is very high and is not conducive to real-time applications. Three fast inter mode selection algorithms, SSI, SGI and FSSI, are proposed.

SSI combines a large block mode detection algorithm, based on a spatial video segmentation method, with the early SKIP mode detection algorithm ES. It works better on video sequences with simple texture. To detect large block modes, SGI introduces a temporal video segmentation method with a detection rate of up to 80%; combined with the ES algorithm, it further improves encoding speed. The effect of SGI depends on the degree of motion in the video sequence and is independent of texture. FSSI uses the correlation between Jmotion and Jmode to filter the prediction modes and can terminate about 87% of the computations for P8×4, P4×8 and P4×4.
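
The abstract does not define Jmotion and Jmode. A minimal sketch, assuming they are the Lagrangian costs commonly used in the JM reference encoder for I and P slices (distortion plus a multiplier times rate, with the multiplier derived from QP). The function names and sample numbers are illustrative, and FSSI's actual filtering rule is not given in the abstract, so none is shown:

```python
import math


def lambda_mode(qp):
    """Mode-decision multiplier in its commonly cited form: 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def lambda_motion(qp):
    """Motion-search multiplier, taken as the square root of lambda_mode."""
    return math.sqrt(lambda_mode(qp))


def j_motion(sad, mvd_bits, qp):
    """J_motion = SAD + lambda_motion * R(motion vector difference)."""
    return sad + lambda_motion(qp) * mvd_bits


def j_mode(ssd, mode_bits, qp):
    """J_mode = SSD + lambda_mode * R(mode, motion and residual bits)."""
    return ssd + lambda_mode(qp) * mode_bits


# Hypothetical numbers for one partition at QP 28.
print(round(j_motion(sad=1200, mvd_bits=10, qp=28), 1))
print(round(j_mode(ssd=5400, mode_bits=96, qp=28), 1))
```

Jmotion is cheap because it needs only the motion search, while Jmode requires a full encode and reconstruction of the partition. That cost gap is what makes it attractive to screen small partitions by Jmotion before paying for Jmode.
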
Compared with the full search of JM10.2, the SNRY of SSI, SGI and FSSI drops by about 0.020 dB, 0.028 dB and 0.044 dB respectively, and encoding time is reduced by about 30%~74%.

H.264 introduces multiple-reference-frame motion estimation, which yields more accurate prediction values, so the decoded image quality is better and the coding efficiency is higher. However, the added computation grows linearly with the number of reference frames. Two fast multiple-reference-frame selection algorithms, FMRSGC and FMRSMVC, are proposed. FMRSGC introduces a Gaussianity test based on third-order moments to decide Ref0 in advance, with a detection rate of up to 83%. At the same time, the candidate reference frames are decided from correlation information, which further improves encoding speed. FMRSMVC detects Ref0 in advance from the characteristics of the motion vectors of the four 8×8 blocks in a macroblock, with a detection rate of up to 80%. At the same time, an edge detection algorithm based on 8×8 blocks is performed to decide the spatial correlation, and the candidate reference frames are then decided. Compared with the full search of five reference frames, SNRY for FMRSGC and FMRSMVC drops by about 0.05 dB, and encoding time is reduced by about 25%~74%.

Motion estimation is the most time-consuming module in the encoder. Based on research into block-matching algorithms (BMA), MPFME and MMPFME are proposed. MPFME mainly adopts the following techniques:
① The accuracy of the predicted motion vectors is analyzed and the PRI is decided.
② Based on the motion characteristics of the macroblock being encoded, appropriate search patterns are selected and the pattern conversion criteria are decided.
③ The search range is limited based on a Gaussianity test related to fourth-order moments.
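
Both Gaussianity tests mentioned above rely on higher-order moments: for Gaussian data the third central moment is zero and the fourth equals three times the squared variance. A minimal sketch of those statistics on a block of samples. Applying them to frame-difference samples and the function names are assumptions, since the abstract gives neither the test statistic nor its thresholds:

```python
def central_moments(samples):
    """Return the second, third and fourth central moments of a sample list."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m2, m3, m4


def gaussianity_statistics(samples):
    """Return (skewness, excess kurtosis); both are near 0 for Gaussian data."""
    m2, m3, m4 = central_moments(samples)
    if m2 == 0:
        return 0.0, 0.0  # constant block: return zeros to avoid dividing by zero
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return skewness, excess_kurtosis


# Hypothetical frame-difference samples for one block.
diff = [-2, -1, -1, 0, 0, 0, 0, 1, 1, 2]
skew, kurt = gaussianity_statistics(diff)
print(skew, kurt)
```

A decision rule would compare these statistics against thresholds: a block whose difference signal looks like Gaussian noise is likely static or well predicted, which is a plausible reason to stop at Ref0 or to shrink the search range.
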
MMPFME mainly adopts the following techniques:
① Based on the motion vectors of 4×4 blocks, a cache context is established, and the attribute of the current 4×4 block is decided by frame difference and background difference. Combined with the correlation information, this decides the predicted motion vectors, the search pattern and the search range.
② The motion vectors of P4×4 are merged from the bottom up. The search range and the search pattern can also be adjusted adaptively according to the motion characteristics.

Compared with UMHexagonS, the number of search points is reduced by about 57%~80%, and SNRY drops by less than 0.02 dB.

Based on the research on the above fast algorithms, an optimized H.264 video encoder is designed and tested on several QCIF sequences. The experimental results show that encoding time is reduced by about 73%~94% with a small loss in SNRY. The structure of a video display system based on the optimized encoder is further designed, and the FPGA design of some of its modules has been completed.
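
The quality figures in this abstract are all given as changes in SNRY. A minimal sketch of that metric, assuming SNRY denotes the luma peak signal-to-noise ratio that the JM software reports for 8-bit video. The function name and sample values are illustrative:

```python
import math


def psnr_y(original, reconstructed, peak=255):
    """Luma PSNR in dB between two equally sized lists of 8-bit samples."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 luma patch before and after coding.
orig = [100, 102, 98, 101]
recon = [101, 101, 99, 100]
print(round(psnr_y(orig, recon), 2))   # 48.13
```

On this logarithmic scale, the reported drops of 0.02 dB to 0.05 dB correspond to a very small increase in mean squared error.
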
Keywords/Search Tags: video coding, H.264/AVC, intra prediction, inter prediction, motion estimation, multiple reference frames, video segmentation