Scalable video encoding, adaptation, and rate modeling

Posted on: 2015-07-06
Degree: Ph.D
Type: Thesis
University: Polytechnic Institute of New York University
Candidate: Xu, Meng
Full Text: PDF
GTID: 2478390017989776
Subject: Engineering
Abstract/Summary:
With the spread of high-speed Internet and mobile wireless networks, multimedia content, especially video, now dominates consumer network traffic. Users of online video services such as video broadcasting or instant video communication usually have different connection bandwidths, different screen sizes, and different demands for, or tolerance of, video quality. To adapt to these heterogeneous connection conditions and user demands, the scalable video coding extension of the H.264 standard (SVC) is a promising approach: multiple bitstreams containing different temporal, quality, or spatial resolutions are encoded into a single bitstream and extracted later on demand.

SVC adopts a layered coding technique, in which the base layer carries only the fundamental information while the enhancement layers carry the refinement information needed to produce enhanced quality. Compared with traditional single-layer video, SVC requires about 20% more bits to maintain the same reconstructed video quality. Moreover, the encoding complexity grows linearly with the number of layers.

This thesis consists of two components. First, we present solutions that reduce SVC coding complexity without much loss in coding efficiency. Then we propose a rate model that predicts the bits needed to code a block from its prediction error and the quantization stepsize. We further consider how to use this model to predict the total rate at different temporal layers.

In the first part, we address SVC coding efficiency and complexity jointly. By analyzing the conventional encoding algorithm for SVC, we design a novel coding scheme that exploits the correlation between layers. In our approach, different quality layers of the same coding unit are forced to use the same mode and, if an Inter mode is chosen, the same motion vector(s). The mode and motion vectors are determined only at the base layer, but the decision also uses information from the highest layer. By forgoing motion estimation and mode decision at higher layers, the complexity of the enhancement layers is reduced to a negligible level without much sacrifice in coding efficiency. For some test sequences, the proposed scheme even achieves better coding efficiency, because no mode or motion information needs to be specified at higher layers.

To further reduce the coding complexity, we investigate the existing early Skip technique and extend it with our unified Direct mode. The proposed early Skip/Direct (ESD) mode decision allows the computationally intensive mode decision to be bypassed when the ESD condition is satisfied. Based on an analysis of the quantization process in video coding, we use the average quantization error as the threshold. The ESD mode decision is further integrated with our multilayer mode decision for SVC, resulting in a significant complexity reduction with only slight degradation in coding efficiency.

In the second part, we investigate the conventional rate model that relates the video bitrate to the prediction error and the quantization error. The conventional model relates the video rate linearly to the logarithm of the prediction error, which fails when the bitrate is low. We propose a non-linear model that is fitted to the collected data. We further show that the quantization error can be modeled by a power function of the quantization stepsize. The complete model predicts the number of bits required to code a block from its prediction error and the quantization stepsize. The model requires four parameters, which can be predicted from content features via lightweight preprocessing.
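The two modeling ingredients named in the second part can be illustrated with a minimal sketch. Nothing below is taken from the thesis itself: the function names fit_power_law and log_rate are hypothetical, the logarithmic rate expression is the textbook Gaussian rate-distortion form used as a stand-in for the "conventional" model, and the sample data use the standard high-resolution approximation D = q^2/12 for a uniform quantizer rather than measured values. The thesis's own non-linear model and its four parameters are not reproduced here.

```python
import math

def fit_power_law(stepsizes, errors):
    """Least-squares fit of errors ~ a * stepsizes**b, done in the log-log domain.

    Hypothetical helper: a power function becomes a straight line after taking
    logarithms, so an ordinary linear regression recovers (a, b).
    """
    xs = [math.log(q) for q in stepsizes]
    ys = [math.log(d) for d in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # exponent of the power function
    a = math.exp(mean_y - b * mean_x)    # multiplicative constant
    return a, b

def log_rate(pred_error_var, quant_error, scale=0.5):
    """Rate that is linear in the logarithm of the prediction error.

    Assumption: the textbook form R = scale * log2(sigma^2 / D), clamped at
    zero. The clamp is an illustrative choice, not the thesis's model.
    """
    return max(0.0, scale * math.log2(pred_error_var / quant_error))

# Placeholder data (not thesis measurements): uniform-quantizer
# high-resolution approximation D = q^2 / 12.
stepsizes = [1.0, 2.0, 4.0, 8.0, 16.0]
errors = [q * q / 12.0 for q in stepsizes]

a, b = fit_power_law(stepsizes, errors)
rate = log_rate(pred_error_var=16.0, quant_error=1.0)
```

With real encoder data, the same fit would be applied to measured quantization errors at each stepsize, and the fitted (a, b) would feed the quantization-error term of whatever rate expression is being evaluated.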
Keywords/Search Tags:Video, Model, Coding, Rate, Error and the quantization, SVC, Content, Mode decision