
Multiple global affine motion models used in video coding

Posted on: 2008-07-24
Degree: Ph.D.
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Li, Xiaohuan
Full Text: PDF
GTID: 1458390005480652
Subject: Engineering
Abstract/Summary:
The research presented in this dissertation explores how a hybrid video codec performs when its motion structure is simplified rather than made more complex. This contrasts with the latest compression standard, H.264, and with the majority of video researchers, who are exploring more complex motion models.

Specifically, we propose to use global motion models instead of local block-wise motion vectors to compress motion information between consecutive frames. To cover the frequently occurring operations of rotation and zooming in global motion, a six-parameter (6-D) affine model is adopted instead of the more common 2-D translational one. To account for multiple moving objects in a video frame, motion segmentation is implemented based on the scalable motion field of an H.264 encoder. An affine model is estimated for each segment and used for global motion compensation of the corresponding areas. A warped reconstruction of the entire video frame is constructed using the segmentation map.

The multiple affine models are predictively compressed with a specially designed vector quantizer, which consists of a long main dictionary stored offline and a short cache word list kept online. The cache word list is searched for a match each time an affine model is quantized. The main dictionary is checked only when a "miss" happens.

While reconstructing the current frame with multiple affine models, the proposed video codec system does not discard the classical block-matched reconstruction of each macroblock. Specifically, a macroblock can be reconstructed under any of the original H.264 inter or intra modes, or with one of the affine models. Hence we add N affine modes to the original macroblock mode list of I4, I16, P16x16, P16x8, P8x16, P8x8, P4x4 and DIRECT, where N is the number of countable motion objects in the frame. One mode from the new list is chosen by Lagrangian optimization. By lengthening the mode list and spending moderately more bits on mode indication, we spare the encoder the prohibitive cost of transmitting a segmentation map to the decoder.

Finally, we present the experimental results of our system in comparison with the latest published version of JM, the H.264 codec reference software. Our system shows a notable gain (up to 0.8 dB) in rate-distortion performance when the video stream bit rate is below 100 kbps. Between 30% and 70% of the macroblocks in a P-frame end up being encoded by the affine modes. The proposed system also shows other advantages over traditional codecs, such as less pronounced blocking artifacts and greater error resilience.
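The three mechanisms named in the abstract (the six-parameter affine model, the cache-then-dictionary quantizer lookup, and Lagrangian mode selection) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. The parameter ordering (a1..a6), the squared-distance match threshold `tol`, the cache update on a miss, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Tuple

Affine = Tuple[float, float, float, float, float, float]  # assumed order: a1..a6


def affine_map(params: Affine, x: float, y: float) -> Tuple[float, float]:
    """Six-parameter affine motion: x' = a1*x + a2*y + a3, y' = a4*x + a5*y + a6."""
    a1, a2, a3, a4, a5, a6 = params
    return (a1 * x + a2 * y + a3, a4 * x + a5 * y + a6)


def sq_dist(u: Sequence[float], v: Sequence[float]) -> float:
    """Squared Euclidean distance between two parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(model: Affine, cache: List[Affine], dictionary: List[Affine],
             tol: float) -> Tuple[str, int]:
    """Search the short cache first; consult the main dictionary only on a miss.

    A "match" is assumed to mean: nearest cache word within squared distance tol.
    Appending the chosen codeword to the cache is an assumed update policy.
    """
    if cache:
        i = min(range(len(cache)), key=lambda k: sq_dist(model, cache[k]))
        if sq_dist(model, cache[i]) <= tol:
            return ("cache", i)
    j = min(range(len(dictionary)), key=lambda k: sq_dist(model, dictionary[k]))
    cache.append(dictionary[j])
    return ("main", j)


def choose_mode(candidates: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Pick the mode minimizing the Lagrangian cost J = D + lam * R.

    candidates maps a mode name to (distortion D, rate R in bits).
    """
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])
```

For example, `choose_mode({"P16x16": (100.0, 40.0), "AFFINE_0": (120.0, 10.0)}, lam=1.0)` prefers the hypothetical affine mode because its lower rate outweighs its higher distortion, which is consistent with the abstract's observation that the gain appears at low bit rates.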
Keywords/Search Tags:Motion, Video, Affine, Global, Multiple, System