
Super-Resolution Estimation Via Sparse Spatio-Temporal Representation With Adaptive Regularized Dictionaries For Video Compression

Posted on: 2013-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z M Pan
Full Text: PDF
GTID: 2218330362959332
Subject: Electronics and Communications Engineering
Abstract/Summary:
Video compression (or video coding) is an essential technology for applications such as digital television, DVD-Video, mobile TV, video conferencing and Internet video streaming. Notably, the growing volume of low-quality visual data from mobile phones, digital cameras and mobile TV stimulates a huge demand for video analysis and computer vision techniques that can be applied to any prevalent standardized video codec as a scalability extension. This raises the broader question of whether more disruptive techniques can provide substantial gains. A key observation for video coding is that a correlation can be established between a sampled low-resolution version and the high-resolution content. For example, scalable video coding provides spatial scalability through down-sampling and inter-layer prediction with up-sampling. However, the coding burden is dominated by a rigid partition of computational complexity between the encoder (heavy) and the decoder (light), which constrains ubiquitous multimedia access in increasingly mobile communication. Distributed video coding (DVC) has since emerged as a promising paradigm: by shifting the computationally intensive prediction from the encoder to the decoder, it accommodates the requirements of mobile camera phones and wireless visual sensor networks. Limited by the estimation of correlated side information, practical DVC schemes often suffer a considerable performance loss compared with traditional predictive coding engines, e.g. H.264/AVC. This insight motivates us to further investigate sparse adaptive inverse reconstruction with advanced regularity in a distributed video coding manner.

Traditional video coding schemes, e.g. H.264/AVC and the ongoing High-Efficiency Video Coding (HEVC), focus on exploiting redundancy among pixels through intra and inter prediction. It is well known that the human visual system has very limited ability to identify the detail of certain objects, e.g. repeated visual patterns or high-frequency textures, owing to masking effects. This makes it possible to economize on the data spent on such regions: coarser pixel-wise accuracy can be tolerated as long as the visual quality is preserved. Indeed, additional prediction methods, e.g. inpainting-based prediction and texture prediction, have been reported to achieve better performance. This suggests a promising potential to synthesize and hallucinate missing texture with good perceptual quality. So far, attempts to restore the missing information have relied on various kinds of assistant side information, e.g. edges and assistant parameters. To maintain the temporal consistency of video, space-time completion has recently been considered in a global optimization sense.

Naturally, more attention has been drawn to the possibility of video reconstruction with state-of-the-art super-resolution approaches, in which the correlation between a sparsely sampled low-resolution version and the high-resolution content can be estimated in a nonparametric sense. Further control of the balance between image quality and bit rate in video compression can be achieved by down-sampling some frames of a video sequence at the encoder to obtain higher compression ratios. Such multiple-resolution approaches are used in many scalable video schemes. The sub-sampled frames have to be up-sampled at the decoder to recover full resolution. However, since some frame information is discarded at the encoder during down-sampling, losses remain after interpolation. Hence, super-resolution reconstruction methods are used to increase the quality of the reconstructed video.

This paper proposes a low bit-rate video coding scheme in which sparse super-resolution estimation over dictionaries provides an effective nonparametric approach to the inverse problem. A subset of key frames in a video sequence is encoded at high resolution and serves as training data at the decoder side, while the remaining frames are down-sampled and coded at low resolution. It is recognized that the primitive patches of an image are of low dimensionality and can be well learned from the primitive patches of different images. Specifically, a video frame is divided into three layers: a primitive layer, a non-primitive coarse layer, and a non-primitive smooth layer. Considering that image primitives may vary significantly across different frames, or across different patches in a single frame, we propose to learn multiple sets of low-resolution/high-resolution subdictionary pairs from the primitive patches of the key frames. Non-primitive volumes, in turn, are consistent along the motion trajectory, carry little structural information, and have sparser representations over a learned 3-D spatio-temporal dictionary. This is achieved through hierarchical bi-directional motion estimation and adaptive overlapped block motion compensation. Correspondingly, the task is formulated as an optimization problem that constructs a sparse representation of low-resolution frame patches or volumes over adaptive regularized dictionaries: a set of 2-D subdictionary pairs trained from 2-D primitive patches, and a 3-D dictionary trained from non-primitive volumes. In reconstruction, the lost high-frequency information of the non-key frames can be synthesized from the sparse spatio-temporal representation over the adaptive regularized dictionaries. The related paper, "Sparse spatio-temporal representation with adaptive regularized dictionaries for super-resolution based video coding", was accepted by IEEE DCC 2012 (Data Compression Conference, 2012).
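
As a rough illustration of the reconstruction idea described above, the following Python sketch computes a sparse code for a low-resolution patch over a low-resolution dictionary and reuses that code with the coupled high-resolution dictionary. It is not the implementation from the thesis, and several parts are assumptions made only for illustration: the dictionaries are random stand-ins rather than subdictionary pairs learned from key-frame primitive patches, down-sampling is assumed to be 2x2 block averaging, orthogonal matching pursuit is swapped in as a generic sparse solver, and the names `block_average_operator`, `make_coupled_dictionaries`, `omp` and `super_resolve_patch` are hypothetical. The 3-D spatio-temporal dictionary and the motion compensation steps are left out.

```python
import numpy as np


def block_average_operator(hr_size, factor):
    """Matrix S mapping a flattened hr_size x hr_size patch to its block-averaged LR patch."""
    lr_size = hr_size // factor
    S = np.zeros((lr_size * lr_size, hr_size * hr_size))
    for r in range(hr_size):
        for c in range(hr_size):
            lr_index = (r // factor) * lr_size + (c // factor)
            S[lr_index, r * hr_size + c] = 1.0 / factor**2
    return S


def make_coupled_dictionaries(S, n_atoms, rng):
    """Random stand-in for a learned LR/HR pair: LR atom j is the down-sampled HR atom j."""
    D_h = rng.standard_normal((S.shape[1], n_atoms))
    D_l = S @ D_h
    norms = np.linalg.norm(D_l, axis=0)
    # Scale both dictionaries by the same factor so the atoms stay coupled
    # and the LR atoms have unit norm.
    return D_l / norms, D_h / norms


def omp(D, y, k, tol=1e-10):
    """Orthogonal matching pursuit: approximate y with at most k atoms of D."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break
        support.append(j)
        # Refit all selected coefficients by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha


def super_resolve_patch(y_lr, D_l, D_h, k):
    """Sparse-code the LR patch over D_l, then synthesize the HR patch with D_h."""
    alpha = omp(D_l, y_lr, k)
    return D_h @ alpha, alpha


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S = block_average_operator(hr_size=8, factor=2)
    D_l, D_h = make_coupled_dictionaries(S, n_atoms=32, rng=rng)

    # Synthetic HR patch that is exactly 2-sparse over D_h (an idealized case).
    alpha_true = np.zeros(32)
    alpha_true[[3, 17]] = [1.5, -0.8]
    x_hr = D_h @ alpha_true
    y_lr = S @ x_hr  # what the decoder would receive for a non-key frame patch

    x_hat, alpha = super_resolve_patch(y_lr, D_l, D_h, k=2)
    print("LR fit error:", np.linalg.norm(S @ x_hat - y_lr))
    print("HR reconstruction error:", np.linalg.norm(x_hat - x_hr))
```

In this toy setting the high-resolution patch is recovered only to the extent that the greedy solver selects the right atoms. With learned dictionaries and real video patches, the quality would depend on how well the key-frame training data represents the non-key frames.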
Keywords/Search Tags: super-resolution, adaptive dictionary, primitive patch, sparse representation, spatio-temporal consistency