Study On 3D Wavelet Scalable Video Coding Technology

Posted on: 2008-10-21
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D D Zhang
Full Text: PDF
GTID: 1118360305956297
Subject: Information and Communication Engineering
Abstract/Summary:
To deliver video reliably to diverse clients over heterogeneous networks, particularly when system resources and network conditions are not known in advance, the coded bit-stream should provide temporal, spatial, SNR, and complexity scalability to meet the requirements of clients with different display resolutions, bandwidths, computational capabilities, and memory capacities. Because the three-dimensional wavelet transform is naturally scalable, spatial scalability and temporal (frame-rate) scalability are easily supported. Moreover, bit-plane coding of the subband coefficients enables quality (SNR) scalability. Recently, three-dimensional wavelet scalable video coding schemes with motion compensated temporal filtering (MCTF) have attracted growing research interest, and this thesis also focuses on such schemes. Its content is as follows.

Firstly, we investigate how to make a good trade-off between the low-resolution mismatch error and the full-resolution coding performance in overcomplete in-band MCTF (OIBMCTF) schemes. To address this major challenge of OIBMCTF schemes, we first analyze how the mismatch error propagates along the lifting structure when the low-resolution video is decoded, and give a propagation model of this error. Based on this analysis, we propose two schemes to reduce the mismatch error: a frame-based scheme, the cross-resolution leaky prediction scheme, and a macroblock-based scheme, the mode-based MCTF scheme. Experimental results show that the proposed schemes dramatically reduce the mismatch error at low resolution, while the performance loss at high resolution is marginal.
These two schemes have been formally accepted by the MPEG wavelet video coding ad-hoc group as the baseline IBMCTF schemes in the reference software for three-dimensional wavelet video coding, and can be used by MPEG members.

Secondly, we investigate how to perform motion prediction and coding efficiently in IBMCTF schemes, and propose an efficient mode-adaptive motion prediction and coding algorithm. Three motion prediction and coding modes are introduced to exploit the subband motion correlation across resolutions as well as the spatial motion correlation in the high-frequency subband. A rate-distortion optimized mode selection engine adaptively decides the most efficient mode. When coding the motion information, we use context-based adaptive binary arithmetic coding and design the corresponding probability models for motion prediction modes, motion alignment modes, and macroblock partition modes to further improve coding efficiency. Experimental results show that, compared with the subband-independent motion prediction method, the proposed scheme improves coding efficiency by about 0.4-0.6 dB for the CIF Foreman sequence and 0.5-0.7 dB for the 4CIF Soccer and City sequences at different bit rates.

Finally, we investigate how to combine the characteristics of the human visual system with three-dimensional wavelet video coding schemes to improve the visual quality of decoded sequences. For the "T+2D" scheme, we propose a perceptually adaptive motion compensated temporal filtering (MCTF) method. For the "2D+T" scheme, we propose a perceptually adaptive in-band preprocessing scheme. In the "T+2D" scheme, a spatio-temporal masking model in the image domain is incorporated into the lifting structure of MCTF. The model guides the motion search and the prediction step in MCTF so as to remove the visual redundancy in the video sequence.
In the "2D+T" scheme, a locally adaptive wavelet-domain JND profile is first built and then incorporated into a preprocessor of the in-band MCTF, which removes the visually redundant coefficients before the MCTF of each spatial band is performed. Experimental results show that the proposed schemes efficiently enhance the visual quality of decoded video at different bit rates.
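
The lifting structure that the abstract refers to can be illustrated with a minimal sketch. It is not the thesis's method: it assumes an unnormalized Haar filter and zero motion (no motion compensation), and the function names `haar_temporal_forward` and `haar_temporal_inverse` are hypothetical. Frames are represented as flat lists of pixel values. The sketch only shows how a predict step and an update step split a sequence into low-pass and high-pass temporal subbands, and why keeping the low-pass frames alone yields a half-frame-rate video.

```python
def haar_temporal_forward(frames):
    """One temporal decomposition level of Haar lifting.

    Assumption: zero motion, so each pixel is filtered with the
    co-located pixel of the neighbouring frame.
    """
    low, high = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        # Predict step: the odd frame is predicted from the even frame,
        # leaving a high-pass residual.
        h = [o - e for e, o in zip(even, odd)]
        # Update step: half of the residual is added back to the even
        # frame, giving the average of the two frames (low-pass).
        l = [e + 0.5 * r for e, r in zip(even, h)]
        low.append(l)
        high.append(h)
    return low, high


def haar_temporal_inverse(low, high):
    """Undo the lifting steps in reverse order to recover all frames."""
    frames = []
    for l, h in zip(low, high):
        even = [lv - 0.5 * hv for lv, hv in zip(l, h)]  # undo update
        odd = [hv + ev for hv, ev in zip(h, even)]      # undo predict
        frames.extend([even, odd])
    return frames


# Four tiny two-pixel "frames" (illustrative values only).
frames = [[1.0, 2.0], [3.0, 6.0], [5.0, 5.0], [5.0, 7.0]]
low, high = haar_temporal_forward(frames)
reconstructed = haar_temporal_inverse(low, high)
```

In this sketch, decoding only `low` gives a sequence at half the frame rate, which is the temporal scalability mentioned above, while decoding both subbands reconstructs the input exactly because each lifting step is inverted by subtracting what it added.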
Keywords/Search Tags: Scalable video coding, three-dimensional wavelet transform, motion compensated temporal filtering, in-band motion compensation, in-band motion prediction and coding, perceptual scalable video coding