
Research on Real-Time Video Compression Techniques and Algorithms Based on H.264

Posted on: 2008-01-31  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Jamil-ur-Rehman  Full Text: PDF
GTID: 1118360245996560  Subject: Information and Communication Engineering
Abstract/Summary:
High-speed connections to the home are commonplace, and the storage capacity of flash memories, hard disks and optical media is greater than ever before. The cost per transmitted or stored bit is falling continuously, so why is video compression needed, and why is there such a momentous effort to make it better? Video compression has two important advantages. First, it makes it feasible to use digital video in transmission and storage environments that would not support uncompressed video. Second, it allows more efficient use of storage and transmission resources.

Image and video compression has been a very active topic of research and development for over 20 years. Many algorithms and systems for compression and decompression have been proposed and developed. To promote competition and increased choice, it has been essential to define standard methods of encoding and decoding so that products from different companies can interoperate efficiently. This has led to the development of a number of international standards for image and video compression, including the JPEG, MPEG and H.26x series of standards.

Video compression algorithms work by removing redundancies in the temporal, spatial and frequency domains. By removing different types of redundancy it is possible to compress the data considerably, at the cost of a certain amount of information loss. Further compression can be attained by encoding the processed data with an entropy coding technique such as Huffman or arithmetic coding. H.264 appreciably improves coding performance at both low and high bit rates compared with earlier coding standards (H.263, MPEG-2 and MPEG-4).

H.264/MPEG-4 Part 10 uses the rate-distortion optimization (RDO) technique to obtain the best result in terms of visual and coding performance. To perform RDO, the encoder exhaustively searches for the best mode in the RD sense among the predefined modes. Consequently, the computational complexity of the encoder increases dramatically, which makes it difficult to use in practical applications such as real-time video communication. This dissertation addresses how to reduce the computational complexity of H.264/MPEG-4 Part 10. The major contributions of this dissertation are summarized below.

We proposed fast intra prediction mode decision using parallel processing. In real-time multimedia scenarios, computational complexity becomes a key constraint. Several attempts have been made to develop fast algorithms for intra prediction mode decision; most existing "fast" intra prediction algorithms reduce computation by decreasing the number of candidate modes, and this reduction in complexity degrades decoded video quality. The full-search algorithm in H.264 computes and compares all modes, so it is guaranteed to choose the best mode. We used parallel processing to address this problem, because parallelism significantly reduces the time needed to reach a solution and increases the size of the problem that can be tackled. We selected an FPGA (Field Programmable Gate Array) platform for parallel processing: hardware circuits on an FPGA run in parallel because each sub-circuit executes its function independently. We implemented the nine intra 4×4 luma prediction modes on the FPGA using both approaches, i.e. serial and parallel processing.
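As a point of reference for the FPGA design, the following is a minimal serial sketch of the full-search 4×4 intra mode decision that the parallel implementation accelerates. It is an illustration only: predict_4x4() is a stubbed placeholder rather than the nine H.264 prediction modes, and plain SAD stands in for the RD cost used by the reference encoder.

```c
/*
 * Minimal serial sketch of the full-search 4x4 intra mode decision.
 * Assumptions: predict_4x4() is a stubbed placeholder (a real encoder
 * derives the prediction from reconstructed neighbouring pixels per
 * mode), and plain SAD replaces the RD cost of the reference encoder.
 */
#include <limits.h>
#include <stdlib.h>

#define NUM_INTRA4x4_MODES 9

/* Placeholder prediction: every mode produces a DC-like block here so
 * the sketch stays self-contained; the nine real modes all differ. */
static void predict_4x4(int mode, const unsigned char neighbours[8],
                        unsigned char pred[16])
{
    (void)mode;
    int sum = 0;
    for (int i = 0; i < 8; i++)          /* 4 top + 4 left samples */
        sum += neighbours[i];
    unsigned char dc = (unsigned char)((sum + 4) >> 3);
    for (int i = 0; i < 16; i++)
        pred[i] = dc;
}

static int sad_4x4(const unsigned char orig[16], const unsigned char pred[16])
{
    int sad = 0;
    for (int i = 0; i < 16; i++)
        sad += abs((int)orig[i] - (int)pred[i]);
    return sad;
}

/* Serial full search: evaluate all nine modes and keep the cheapest.
 * In the FPGA design, nine prediction/cost units run concurrently and
 * a comparator tree selects the minimum, so the latency approaches
 * that of a single mode evaluation instead of nine. */
int best_intra4x4_mode(const unsigned char orig[16],
                       const unsigned char neighbours[8])
{
    int best_mode = 0;
    int best_cost = INT_MAX;
    for (int mode = 0; mode < NUM_INTRA4x4_MODES; mode++) {
        unsigned char pred[16];
        predict_4x4(mode, neighbours, pred);
        int cost = sad_4x4(orig, pred);
        if (cost < best_cost) {
            best_cost = cost;
            best_mode = mode;
        }
    }
    return best_mode;
}
```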
Experimental results show that the time needed to find the best intra prediction mode with parallel processing is much less than with serial processing, with no performance degradation.

We proposed efficient techniques for signaling the intra prediction modes. The choice of intra prediction mode for each 4×4 block must be signaled to the decoder, which could potentially require a large number of bits. However, intra modes of neighbouring 4×4 blocks are often correlated, so predictive coding is used to signal 4×4 intra modes. At frame boundaries, not all modes can be applied because of the limited availability of reference pixels for prediction. The question then arises: is it feasible to use the same signaling technique when fewer than nine modes are available? We proposed techniques different from the standard one for signaling 4×4 intra prediction modes. The proposed technique for signaling the three modes (1, 2 and 8) available at the upper frame/slice boundary is as follows: the encoder sends a flag for each 4×4 block indicating whether the previous (most probable) intra 4×4 prediction mode is used; if the flag is '1', the most probable prediction mode is used, and if the flag is '0', one more bit is sent to indicate which of the remaining two modes is used. We proposed three different techniques for signaling the four modes (0, 2, 3 and 7) available at the left frame/slice boundary; the best of these is as follows: the encoder sends a flag for each 4×4 block; if the flag is '1', the most probable prediction mode is used; if the flag is '0', another flag is sent to indicate the next most probable mode; if this flag is also '0', one more bit is sent to indicate which of the remaining two modes is used. Experimental results show that the proposed techniques outperform the existing technique.

We proposed another technique for fast intra prediction mode decision by selecting a smaller number of modes. As mentioned above, it is not practical to apply all 4×4 luma intra prediction modes at the frame/slice boundaries, so bits can be saved by signaling fewer intra prediction modes. Only three 4×4 intra prediction modes (1, 2 and 8) can be applied at the upper frame/slice boundary, four modes (0, 2, 3 and 7) at the left frame/slice boundary, seven modes (0, 1, 2, 4, 5, 6 and 8) at the right frame/slice boundary, and all nine modes at the remaining 4×4 blocks. On the right frame/slice boundary we selected only five modes instead of seven and computed the RD performance of different five-mode combinations; similarly, for the remaining blocks we selected five modes instead of nine and computed the RD performance of different five-mode combinations. Analysis of the experimental results showed that the combination of modes (0, 1, 2, 4 and 8) gives the best results on the right boundary, and the combination of modes (0, 1, 3, 4 and 8) gives the best RD performance for the remaining 4×4 blocks. The proposed technique for signaling five intra prediction modes is as follows: the encoder sends a flag for each 4×4 block; if the flag is '1', the most probable prediction mode is used; if the flag is '0', a 2-bit parameter is sent to signal which of the remaining four modes is used.
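The flag-based signaling described above can be summarized in a short bitstream-writing sketch. The Bitstream type, the put_bits() writer and the ordering of the "remaining" modes below are assumptions made for illustration; they are not the syntax of the reference software.

```c
/*
 * Sketch of the proposed boundary mode-signaling schemes. The bit
 * writer, the Bitstream type and the ordering of the "remaining"
 * modes are illustrative assumptions, not reference-software syntax.
 */
#include <stdio.h>
#include <string.h>

typedef struct {
    unsigned char buf[64];
    int bitpos;
} Bitstream;

/* Hypothetical writer: append the 'nbits' least significant bits of 'val'. */
static void put_bits(Bitstream *bs, unsigned val, int nbits)
{
    for (int i = nbits - 1; i >= 0; i--) {
        int bit = (val >> i) & 1;
        bs->buf[bs->bitpos >> 3] |= (unsigned char)(bit << (7 - (bs->bitpos & 7)));
        bs->bitpos++;
    }
}

/* Index of 'chosen' among the candidates once the most probable mode
 * has been removed from the list. */
static int remaining_index(int chosen, int most_probable,
                           const int *cand, int n)
{
    int k = 0;
    for (int i = 0; i < n; i++) {
        if (cand[i] == most_probable)
            continue;
        if (cand[i] == chosen)
            return k;
        k++;
    }
    return 0; /* not reached when 'chosen' is a valid candidate */
}

/* Upper boundary, three candidates {1, 2, 8}: '1' = most probable
 * mode, else '0' followed by 1 bit choosing one of the two others. */
static void signal_mode_upper(Bitstream *bs, int chosen, int most_probable,
                              const int cand[3])
{
    if (chosen == most_probable) { put_bits(bs, 1, 1); return; }
    put_bits(bs, 0, 1);
    put_bits(bs, (unsigned)remaining_index(chosen, most_probable, cand, 3), 1);
}

/* Five candidates (right boundary or non-boundary blocks): '1' = most
 * probable mode, else '0' followed by 2 bits choosing one of the four
 * remaining modes. */
static void signal_mode_five(Bitstream *bs, int chosen, int most_probable,
                             const int cand[5])
{
    if (chosen == most_probable) { put_bits(bs, 1, 1); return; }
    put_bits(bs, 0, 1);
    put_bits(bs, (unsigned)remaining_index(chosen, most_probable, cand, 5), 2);
}

int main(void)
{
    Bitstream bs;
    memset(&bs, 0, sizeof bs);
    const int upper[3] = { 1, 2, 8 };
    const int five[5]  = { 0, 1, 3, 4, 8 };
    signal_mode_upper(&bs, 8, 2, upper);     /* non-MPM: 1 + 1 = 2 bits */
    signal_mode_five(&bs, 3, 0, five);       /* non-MPM: 1 + 2 = 3 bits */
    printf("bits written: %d\n", bs.bitpos); /* 5 */
    return 0;
}
```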
Experimental results show that the increase in the number of bits needed to encode the residual coefficients is approximately equal to the decrease in the number of bits required to signal the intra prediction modes, yielding almost the same PSNR (peak signal-to-noise ratio). With the proposed techniques, the speed of finding the best 4×4 intra prediction mode is increased by about 45% without significant performance degradation.

We also studied the effect of adaptive probability updating of the look-up table values used to encode the coefficient token. In the H.264/AVC standard, when the entropy coding mode is set to zero, residual block data is coded using a context-adaptive variable-length coding (CAVLC) scheme. The first VLC, the coefficient token, jointly encodes the total number of nonzero coefficients and the number of trailing ones. There are four look-up tables to choose from when encoding the coefficient token of a 4×4 block. We studied the effect of adaptively assigning shorter codes to the more probable (TotalCoeffs, T1s) pairs and longer codes to the less probable ones. There are large gaps between the probability curves of three pairs ((0,0), (1,1) and (2,2)), so adaptive probability updating cannot improve on the fixed assignment for them. The probability curves of the other pairs intersect one another, and adaptive probability updating does give better results for them, but the combined probability of these pairs is very small (≈10%).
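To make the adaptive-updating idea concrete, the sketch below counts observed (TotalCoeffs, T1s) pairs and periodically re-ranks them so that more frequent pairs would receive shorter codes. The pair count, update interval and data structures are illustrative assumptions; the actual CAVLC coefficient-token tables are fixed by the H.264/AVC specification, and, as noted above, this mechanism helps only for the low-probability pairs whose curves intersect.

```c
/*
 * Sketch of adaptive probability updating for coefficient-token code
 * assignment. NUM_PAIRS, the update interval and the ranking scheme
 * are illustrative assumptions; the real CAVLC look-up tables are
 * fixed by the H.264/AVC specification.
 */
#include <stdlib.h>
#include <string.h>

/* Valid (TotalCoeffs 0..16, T1s 0..min(TotalCoeffs,3)) combinations. */
#define NUM_PAIRS 62

typedef struct {
    unsigned long count[NUM_PAIRS]; /* observed frequency of each pair    */
    int rank[NUM_PAIRS];            /* rank[i]: position of pair i in the */
                                    /* shortest-code-first ordering       */
    unsigned long seen;             /* symbols observed so far            */
} AdaptiveTable;

static void table_init(AdaptiveTable *t)
{
    memset(t, 0, sizeof *t);
    for (int i = 0; i < NUM_PAIRS; i++)
        t->rank[i] = i;             /* start from the default ordering */
}

/* qsort helper: order pair indices by descending count, stable ties. */
static const unsigned long *g_counts;
static int cmp_desc(const void *a, const void *b)
{
    int ia = *(const int *)a, ib = *(const int *)b;
    if (g_counts[ia] != g_counts[ib])
        return g_counts[ib] > g_counts[ia] ? 1 : -1;
    return ia - ib;
}

/* Record one observed pair; every 'interval' symbols, re-rank so the
 * most frequent pairs map to the shortest codes. */
static void table_update(AdaptiveTable *t, int pair_index, int interval)
{
    t->count[pair_index]++;
    if (++t->seen % (unsigned long)interval != 0)
        return;

    int order[NUM_PAIRS];
    for (int i = 0; i < NUM_PAIRS; i++)
        order[i] = i;
    g_counts = t->count;
    qsort(order, NUM_PAIRS, sizeof order[0], cmp_desc);
    for (int pos = 0; pos < NUM_PAIRS; pos++)
        t->rank[order[pos]] = pos;  /* position 0 = shortest code */
}

int main(void)
{
    AdaptiveTable t;
    table_init(&t);
    /* e.g. pair index 5 observed repeatedly; re-rank every 256 symbols */
    for (int i = 0; i < 1024; i++)
        table_update(&t, 5, 256);
    return t.rank[5];               /* 0: pair 5 now gets the shortest code */
}
```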
Keywords/Search Tags: Fast intra prediction, Parallel processing, Signaling modes, Coefficient token