
3-D video coding system with enhanced rendered view quality

Posted on: 2012-01-13
Degree: Ph.D.
Type: Dissertation
University: University of Southern California
Candidate: Kim, Woo-Shik
Full Text: PDF
GTID: 1458390008491041
Subject: Engineering
Abstract/Summary:
The objective of this research is to develop a new 3-D video coding system that provides better coding efficiency and improved subjective quality compared to existing 3-D video systems such as the depth image based rendering (DIBR) system. One could, of course, increase overall performance by focusing on better generic coding tools; instead, we focus on techniques specific to 3-D video. In particular, we consider improved representations for depth information, as well as information that can directly contribute to improved intermediate view interpolation.

As a starting point, we analyze the distortions that occur in rendered views generated by the DIBR system and classify them in order to evaluate their impact on subjective quality. We find that the rendered view distortion due to depth map coding is non-linear (increases in intensity errors in the interpolated view are not proportional to increases in depth map coding errors) and highly localized (very large errors occur only in a small subset of pixels in a video frame), which can lead to significant degradation in perceptual quality. A flickering artifact is also observed, caused by temporal variation in the depth map sequence.

To address these problems, we first propose new coding tools that reduce the rendered view distortion, defining a new distortion metric that relates distortions in the coded depth map to distortions in the rendered view. In addition, a new skip mode selection method is proposed based on local video characteristics. Our experimental results show coding gains of up to 1.6 dB in interpolated frame quality, as well as better subjective quality with reduced flickering artifacts.

We also propose a new transform coding method that uses a graph-based representation of the signal, which we name the graph-based transform.
Since a depth map consists of smooth regions separated by sharp edges along object boundaries, efficient transform coding can be performed by forming a graph in which pixels are not connected across edges. Experimental results show that a coding efficiency improvement of 0.4 dB can be achieved by applying the new transform in a hybrid manner with the DCT to compress a depth map.

Second, we propose a solution in which depth transition data is encoded and transmitted to the decoder. The depth transition data for a given pixel indicates the camera position at which that pixel's depth changes. For example, for a pixel that belongs to the foreground in the left image and to the background in the right image, this information tells us at which intermediate view (moving from left to right) the pixel becomes a background pixel. The main reason to transmit this information explicitly is that it can be used to improve view interpolation at many different intermediate camera positions. Simulation results show that subjective quality can be significantly improved using the proposed depth transition data, with maximum PSNR gains of about 2 dB. We foresee further gains as we optimize the amount of depth transition data being transmitted.
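The graph-based transform described above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: pixels of a depth block become graph nodes, 4-neighbour links are dropped where the depth difference exceeds a threshold (the threshold value here is an arbitrary assumption), and the eigenvectors of the resulting graph Laplacian serve as the transform basis.

```python
import numpy as np

def graph_based_transform(block, edge_threshold=8):
    """Sketch of a graph-based transform for a depth block.

    Pixels are nodes of a 4-connected graph; links crossing a sharp
    depth discontinuity (difference > edge_threshold) are removed, so
    the basis adapts to object boundaries. All names and the threshold
    are illustrative assumptions.
    """
    h, w = block.shape
    n = h * w
    A = np.zeros((n, n))                     # adjacency matrix
    idx = lambda r, c: r * w + c
    for r in range(h):
        for c in range(w):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if (rr < h and cc < w and
                        abs(int(block[r, c]) - int(block[rr, cc])) <= edge_threshold):
                    A[idx(r, c), idx(rr, cc)] = A[idx(rr, cc), idx(r, c)] = 1
    L = np.diag(A.sum(axis=1)) - A           # combinatorial graph Laplacian
    _, basis = np.linalg.eigh(L)             # eigenvectors = transform basis
    coeffs = basis.T @ block.flatten()       # forward transform
    return coeffs, basis

# A piecewise-constant block with one sharp vertical edge yields a
# graph with two components, so the signal concentrates in very few
# coefficients instead of spreading across DCT basis functions.
block = np.array([[10, 10, 80, 80],
                  [10, 10, 80, 80],
                  [10, 10, 80, 80],
                  [10, 10, 80, 80]], dtype=np.uint8)
coeffs, basis = graph_based_transform(block)
recon = (basis @ coeffs).reshape(block.shape)  # inverse transform
```

Because no link crosses the 10/80 edge, the block lies entirely in the Laplacian's nullspace (one constant vector per component), which is what makes the representation so compact for depth maps.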
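The role of depth transition data in view interpolation can be illustrated with a deliberately simplified per-pixel rule. This is a hedged sketch, not the proposed algorithm: the function name, the [0, 1] camera parameterization, and the hard switch are all assumptions made for illustration.

```python
def interpolate_pixel(left_val, right_val, alpha, transition):
    """Sketch of view interpolation guided by depth transition data.

    `alpha` is the intermediate camera position (0 = left view,
    1 = right view); `transition` is this pixel's transmitted depth
    transition position. Before the transition the pixel keeps its
    left-view value, after it the right-view value, instead of
    blending the two and producing ghosting at object boundaries.
    """
    return left_val if alpha < transition else right_val

# A pixel that is foreground (value 200) in the left view and
# background (value 50) in the right view, with its transition
# occurring at camera position 0.6:
print(interpolate_pixel(200, 50, 0.3, 0.6))  # still foreground: 200
print(interpolate_pixel(200, 50, 0.8, 0.6))  # now background: 50
```

The key design point is that a single transmitted transition position serves every intermediate camera position, which is why the abstract emphasizes its reuse across many interpolated views.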
Keywords/Search Tags: 3-D video, Coding, Depth transition data, Rendered view, System, Quality, New