Algorithms On The High Efficient Compression And Resource Allocation For Three-dimensional Video

Posted on: 2016-03-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Ge
Full Text: PDF
GTID: 1108330482963574
Subject: Communication and Information System
Abstract/Summary:
Because three-dimensional video (3DV) contains distance (depth) information, it can not only provide videos from different viewpoints, giving users three-dimensional visual perception, but also use multiview videos and their associated depth map sequences to synthesize a video from an arbitrary viewpoint within a certain range. As a result, 3DV-related technology has attracted growing research interest in recent years.
3DV data formats can be divided into the multiview video (MV) format and the multiview video plus depth (MVD) format. The former gives users only limited three-dimensional visual perception, while the latter uses Depth Image Based Rendering (DIBR) to synthesize a video from an arbitrary viewpoint within a certain range. The latter is widely used because of its smaller data volume, higher-quality synthesized views, ease of compression, and forward and backward compatibility. Even so, the data volume of MVD is huge, which makes it difficult to transmit and store with current bandwidth and storage capacity. Compressing the original data so that it meets the bit constraints of transmission bandwidth or storage space while preserving the reconstruction quality of the encoded information is therefore a daunting task that current 3DV systems must address, and efficient three-dimensional video coding technology is very important.
Several video standards organizations have set up ad hoc groups dedicated to three-dimensional video coding. The International Organization for Standardization's Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG) have jointly developed an H.264/AVC-based multiview video coding standard (H.264/MVC) and 3DV coding standards based on High Efficiency Video Coding (HEVC), namely an HEVC-based multiview video coding standard (MV-HEVC) and an HEVC-based multiview video plus depth coding standard (3D-HEVC). Meanwhile, a large number of researchers have studied this field and published a considerable body of literature. Against this background, this dissertation focuses on key three-dimensional video coding technologies. Its major contributions are summarized as follows:
1. Temporal Subsampling based Depth Maps Coding and Reconstruction Method
This dissertation first analyzes the 3DV coding platform and proposes two coding methods based on temporal subsampling. In the depth temporal subsampling method for the left and right non-base views, the initial reconstruction of each discarded depth frame, obtained through Motion Vector Fields (MVF), is classified into Depth Smooth Regions (DSR) and Depth Non-smooth Regions (DNSR), and a different reconstruction method is then applied to each type of region. In the depth temporal subsampling method for the intermediate view, the discarded depth frames are reconstructed using temporal consistency and multiview correspondences, and the reconstructed depth is refined using a Wiener filter and the original video of the intermediate view. To further improve the quality of the reconstructed depth, a Wiener filter is applied to both types of reconstructed results.
Experimental results demonstrate that a maximum peak signal-to-noise ratio (PSNR) gain of 0.388 dB can be achieved for the reconstructed virtual view at the same coding bit rate.
2. Virtual View oriented Distortion Criterion and Lagrangian Multiplier based Depth Maps Coding
We propose a virtual view oriented distortion criterion and a novel Lagrangian multiplier, and apply them in the rate distortion optimization (RDO) procedure of depth map encoding to improve the coding efficiency of depth maps. 3DV comprises multiview videos and associated depth maps, and the depth maps are not viewed directly but are used to render virtual views. Therefore, taking the distortion of the virtual view into consideration, a modified distortion model and a virtual view oriented Lagrangian multiplier are used in the RDO procedure of depth map encoding. Experimental results demonstrate the accuracy of the model. When the proposed model and Lagrangian multiplier are incorporated into the mode decision procedure of Joint Model version 18.5 (JM18.5) of H.264/AVC, a maximum BD-PSNR gain of 0.458 dB and an average BD-PSNR gain of 0.258 dB can be achieved.
3. Decoded MVD based Distortion Model of Synthesized Intermediate Virtual Views
We derive a distortion model for the intermediate virtual view synthesized from its neighboring multiview videos and associated depth maps. We introduce Depth Quad-Tree Decomposition (DQTD) into the distortion analysis to relate the distortion of the synthesized virtual view to the compression distortion of the neighboring multiview videos and that of the associated depth maps. This relationship is described by a quadratic model named the virtual view average distortion (VVAD) model. The VVAD model can be used in the RDO procedure of motion estimation and mode selection, and also in bitrate allocation between texture and depth information.
4. Model based Joint Bit Allocation Algorithm for Multiview Videos and Associated Depth Maps
We propose a scheme for joint coding rate allocation between multiview videos and associated depth maps. First, to capture how the average distortion and the sum bitrate of the multiview videos and associated depth maps from two adjacent views depend on the Quantization Parameter (QP), we derive four models: a multiview video Average Distortion-Quantization Parameter (ADQ) model, an associated depth map ADQ model, a multiview video Sum Bitrate-QP (SBQ) model, and an associated depth map SBQ model. Then, based on these models and the VVAD model, the 3DVC bit allocation problem is converted into a constrained optimization problem, which is solved by a Genetic Algorithm that searches for the optimal QP pair. Experimental results demonstrate that, because the bit allocation scheme takes both synthesized view quality and bitrate utilization into consideration, the absolute difference between the constraint and the actual coding bitrate (referred to as "rate inaccuracy") of the proposed method is only 7.405% on average. Compared with the two methods used for comparison, the proposed method achieves a maximum gain of 1.951 dB under the same bitrate constraint.
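The joint bit allocation in contribution 4 can be illustrated with a minimal sketch: choose the texture/depth QP pair that minimizes the modeled synthesized-view distortion while the summed bitrate stays within the budget. Everything below is an assumption for illustration only. The abstract does not give the fitted ADQ, SBQ, or VVAD coefficients, so the `toy_*` functions are hypothetical placeholders. The dissertation solves the problem with a Genetic Algorithm; this sketch swaps in an exhaustive search over the H.264 QP range (0 to 51), which is small enough to enumerate. The normalization used in `rate_inaccuracy_percent` (relative to the target bitrate) is also an assumption.

```python
from itertools import product


def allocate_qp_pair(rate_budget, texture_rate, depth_rate, synth_distortion,
                     qp_values=range(0, 52)):
    """Return (qp_texture, qp_depth, rate) minimizing modeled distortion
    subject to texture_rate(qp_t) + depth_rate(qp_d) <= rate_budget.

    Exhaustive search stands in for the Genetic Algorithm named in the text.
    """
    best = None
    for qp_t, qp_d in product(qp_values, repeat=2):
        rate = texture_rate(qp_t) + depth_rate(qp_d)
        if rate > rate_budget:
            continue  # violates the bitrate constraint
        dist = synth_distortion(qp_t, qp_d)
        if best is None or dist < best[0]:
            best = (dist, qp_t, qp_d, rate)
    if best is None:
        raise ValueError("no QP pair satisfies the rate budget")
    _, qp_t, qp_d, rate = best
    return qp_t, qp_d, rate


def rate_inaccuracy_percent(target_rate, actual_rate):
    """Absolute rate difference as a percentage of the target (assumed form)."""
    return abs(actual_rate - target_rate) / target_rate * 100.0


# Hypothetical placeholder models, NOT the fitted models from the dissertation.
def toy_texture_rate(qp):
    return 8000.0 * 2.0 ** (-qp / 6.0)  # kbps, decreasing in QP


def toy_depth_rate(qp):
    return 2000.0 * 2.0 ** (-qp / 6.0)  # kbps, decreasing in QP


def toy_synth_distortion(qp_t, qp_d):
    return 0.04 * qp_t ** 2 + 0.01 * qp_d ** 2  # increasing in both QPs


if __name__ == "__main__":
    qp_t, qp_d, rate = allocate_qp_pair(
        1000.0, toy_texture_rate, toy_depth_rate, toy_synth_distortion)
    print(qp_t, qp_d, round(rate, 1), rate_inaccuracy_percent(1000.0, rate))
```

Passing the models in as callables keeps the search independent of their form, so fitted models could replace the placeholders without changing `allocate_qp_pair`.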
Keywords/Search Tags: three-dimensional video coding, depth map coding, distortion of synthesized virtual view, joint bit allocation