Statistical Inference Model And Its Applications To Video Coding

Posted on: 2011-05-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y B Zhang
GTID: 1118360332956451
Subject: Computer application technology
Abstract/Summary:
With the boom of information technology and networks, a new generation of highly efficient video coding techniques has become one of the most active research areas in academia and industry in recent years. The current mainstream standards, H.264/AVC and AVS, more than double the compression performance of earlier standards. However, these compression techniques still cannot achieve satisfactory results, and the bitstream is susceptible to transmission errors under limited network bandwidth. Consequently, how to further improve the efficiency of video compression, as well as the effectiveness of error concealment, under limited network bandwidth has attracted more and more attention. From the signal processing point of view, the input signal can be down sampled (in the spatial or temporal domain) and compressed; the decompressed down sampled signal is then up sampled (termed interpolation in the spatial domain and frame rate up conversion in the temporal domain) to restore the resolution or frame rate of the input signal. The compression efficiency of this method is greatly influenced by the performance of frame rate up conversion and interpolation. From the information theoretic point of view, frame rate up conversion, interpolation and error concealment can be viewed as specific statistical inference processes: the estimation or prediction of some unknown quantity from known observations. The efficiency of inference depends on the accuracy of the model employed to represent the source signal. A translational motion model is employed to describe the redundancy between successive frames in traditional frame rate up conversion and error concealment. However, translational motion cannot accurately capture the local image properties, which causes a mismatch between the source signal and the predicted one. In this dissertation, frame rate up conversion, interpolation and error concealment are investigated from the statistical inference point of view.
The detailed contents of this dissertation are as follows. First, most traditional frame rate up conversion methods try to find accurate motion information for the to-be-interpolated frame by employing a translational motion model. However, such methods cannot characterize the local image properties. To improve the performance of frame rate up conversion, this dissertation proposes a spatio-temporal auto regressive (STAR) model based frame rate up conversion method. In the STAR model, each pixel in the interpolated frame is approximated as the weighted combination of a sample space that includes the pixels within its two temporal neighborhoods from the previous and following original frames, as well as the available interpolated pixels within its spatial neighborhood in the current to-be-interpolated frame. To derive accurate STAR weights, a self-feedback weight training algorithm is proposed. In each iteration, the pixels of each training window in the interpolated frames are first approximated by the sample space from the previous and following original frames and the to-be-interpolated frame. Then the actual pixels of each training window in the original frame are approximated, with the same weights, by the sample space from the previous and following interpolated frames and the current original frame. The weights of each training window are calculated by jointly minimizing the distortion between the interpolated frames in the current and previous iterations, as well as the distortion between the original frame and its interpolated counterpart. Experimental results verify that the STAR model is able to improve the quality of interpolated frames under both objective and subjective criteria, especially for regions full of details. Second, in the STAR model the temporal neighborhoods are centered at the collocated pixels in the previous and following frames. Consequently, for sequences with moderate or large motions, it is difficult to accurately capture the local image property.
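As a rough illustration of the STAR prediction step described above, the following sketch builds the sample space for one pixel and interpolates a frame in raster-scan order. It is a simplified sketch under stated assumptions, not the dissertation's implementation: the function names, the radius parameter `r`, the frame-averaging initialization, and the use of a single fixed weight vector for the whole frame are all hypothetical, and the self-feedback weight training is omitted.

```python
import numpy as np

def star_sample_vector(prev, nxt, cur, y, x, r=1):
    """Collect the STAR sample space for pixel (y, x).

    prev, nxt: original previous/following frames (2-D arrays).
    cur: partially interpolated current frame; only causal pixels
         (rows above, and pixels to the left in the same row) are read.
    r: neighborhood radius (hypothetical parameter).
    """
    # Temporal neighborhoods centered at the collocated pixel.
    t_prev = prev[y - r:y + r + 1, x - r:x + r + 1].ravel()
    t_next = nxt[y - r:y + r + 1, x - r:x + r + 1].ravel()
    # Spatial neighbors that are already interpolated in raster order.
    above = cur[y - r:y, x - r:x + r + 1].ravel()
    left = cur[y, x - r:x]
    return np.concatenate([t_prev, t_next, above, left])

def star_interpolate(prev, nxt, weights, r=1):
    """Interpolate interior pixels with one fixed weight vector (assumption)."""
    h, w = prev.shape
    # Assumed initialization: frame averaging, so border pixels have values.
    cur = 0.5 * (prev.astype(float) + nxt.astype(float))
    for y in range(r, h - r):
        for x in range(r, w - r):
            sample = star_sample_vector(prev, nxt, cur, y, x, r)
            cur[y, x] = sample @ weights
    return cur
```

With `r=1` the sample vector has 22 entries (9 + 9 temporal, 4 spatial). Because the temporal windows in this sketch sit at the collocated position, content that moves by more than `r` pixels between frames falls outside them, which is the limitation noted above.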
To overcome the shortcomings of the STAR model, this dissertation proposes a motion aligned auto regressive (MAAR) model based frame rate up conversion method. In the proposed MAAR, each pixel is interpolated as the average of the results generated by one forward MAAR (Fw-MAAR) model and one backward MAAR (Bw-MAAR) model. In the Fw-MAAR model, each pixel in the to-be-interpolated frame is generated as a linear weighted summation of the pixels within a motion aligned square neighborhood in the previous frame. To derive more accurate interpolation weights, the aligned actual pixels in the following frame are also estimated, with the same weights, as a linear weighted summation of the newly interpolated pixels in the to-be-interpolated frame. Consequently, the aligned actual pixels in the following frame can be estimated as a weighted summation of the corresponding pixels within an enlarged square neighborhood in the previous frame. The Bw-MAAR model operates likewise, except in the reverse direction. A damped Newton algorithm is then proposed to compute the adaptive interpolation weights for the Fw-MAAR and Bw-MAAR models. Experimental results verify that the MAAR model achieves more robust results than the STAR model. Third, this dissertation proposes an auto regressive (AR) model and applies it, using the weighted least squares method, to error concealment for block-based packet video coding. In the proposed method, the best motion vector for the corrupted block is first derived. Each pixel within the corrupted block is then restored, in a regression manner, as the weighted summation of the pixels within a square centered at the pixel indicated by the selected motion vector. Two novel AR coefficient derivation algorithms, under spatial and temporal continuity constraints respectively, are proposed.
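The weighted least squares regression named above can be sketched as follows. This is only a minimal illustration: the function names, the argument layout, and the radius `r` are hypothetical, and the per-sample confidence weights are simply passed in by the caller rather than derived inside the sketch.

```python
import numpy as np

def fit_ar_weighted_ls(samples, targets, confidence):
    """Solve min_a sum_i c_i * (t_i - s_i . a)^2 for the AR coefficients a.

    samples: (n, k) matrix, one row of k reference pixels per training pixel.
    targets: (n,) known pixel values to be regressed.
    confidence: (n,) non-negative confidence weights (assumed given).
    """
    sqrt_c = np.sqrt(confidence)
    # Scaling each row by sqrt(c_i) turns weighted LS into ordinary LS.
    coeffs, *_ = np.linalg.lstsq(
        samples * sqrt_c[:, None], targets * sqrt_c, rcond=None
    )
    return coeffs

def conceal_pixel(ref_frame, y, x, mv, coeffs, r=1):
    """Restore pixel (y, x) from the square centered at the motion-aligned position.

    mv: (dy, dx) motion vector selected for the corrupted block.
    """
    cy, cx = y + mv[0], x + mv[1]
    patch = ref_frame[cy - r:cy + r + 1, cx - r:cx + r + 1].ravel()
    return float(patch @ coeffs)
```

In this sketch the coefficients would be fitted once per corrupted block from training pixels and then applied to every pixel of that block through `conceal_pixel`.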
First, we present a coefficient derivation algorithm under the spatial continuity constraint, in which the sum of the weighted squared errors within the available neighboring blocks is minimized. The confidence weight of each sample within the neighboring blocks is inversely proportional to its distance from the corrupted block. Second, we provide a coefficient derivation algorithm under the temporal continuity constraint, in which the sum of the weighted squared errors within an enlarged motion aligned block in the previous frame is minimized. The confidence weight of each padded sample is inversely proportional to the distance between the corresponding sample and the motion aligned block. The regression results generated by the two algorithms are then merged to form the final restoration. Experimental results verify that the proposed method is able to improve the quality of restored blocks. Fourth, traditional image down sampling methods aim to remove the aliasing artifacts of the down sampled image. However, these methods usually neglect, during down sampling, the influence that up sampling (interpolation) has on the quality of the high resolution image. This dissertation proposes an interpolation dependent image down sampling (IDIDS) algorithm, which can generate an up sampled image with high visual quality. In IDIDS, the optimal down sampled image is derived by minimizing the sum of squared errors between the input image and the corresponding interpolated one. For interpolation methods with fixed interpolation coefficients, the least squares (LS) method can be used directly to obtain the down sampled image. For adaptive interpolation methods with varying interpolation coefficients, we devise a content dependent IDIDS algorithm, which exhibits superior performance. To reduce the space and computational complexity of IDIDS, we also provide a block wise implementation, which is very efficient for practical applications.
Experimental results verify that the proposed method is able to significantly improve the quality of the up sampled image.
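For the fixed-coefficient case of IDIDS described above, the idea can be sketched in one dimension. The sketch assumes that up sampling is a known linear operator U (here, factor-two linear interpolation, chosen purely for illustration) and finds the down sampled signal d that minimizes the squared error between the input x and U d. The function names and the 1-D setting are assumptions; the dissertation works on 2-D images and its interpolation filters are not specified in this abstract.

```python
import numpy as np

def linear_upsample_matrix(m):
    """Matrix U mapping m low-res samples to 2m-1 high-res samples
    (1-D linear interpolation by a factor of two; an assumed interpolator)."""
    U = np.zeros((2 * m - 1, m))
    for i in range(m):
        U[2 * i, i] = 1.0          # even positions copy a low-res sample
    for i in range(m - 1):
        U[2 * i + 1, i] = 0.5      # odd positions average the two neighbors
        U[2 * i + 1, i + 1] = 0.5
    return U

def idids_downsample(x, U):
    """Return d minimizing ||x - U d||^2, i.e. the interpolation-aware down sampling."""
    d, *_ = np.linalg.lstsq(U, x, rcond=None)
    return d
```

Compared with plain decimation (`x[::2]`), the least squares solution can never give a larger reconstruction error under the same interpolator, since decimation is just one candidate among all possible down sampled signals.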
Keywords/Search Tags: Video coding, frame rate up conversion, error concealment, statistical inference, auto regressive, local image property