A Coding Scheme Based On Visual Perception For Multi-view Video Plus Depth

Posted on: 2013-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhou
Full Text: PDF
GTID: 2248330362975234
Subject: Signal and Information Processing
Abstract/Summary:
With the rapid development of modern information technology, interactive multimedia has become central to the development and application of new video technology. Traditional two-dimensional video carries no depth information and can no longer meet viewers' needs. Three-dimensional video provides depth information, visual realism, and stereoscopic perception, satisfies viewers' demand for interaction, and has attracted growing interest. It has become a hot topic in video research, with a wide range of applications such as digital television, free-viewpoint video, remote education, remote communication, remote medical treatment, three-dimensional video conferencing, virtual reality systems, security monitoring, and military combat simulation.

Multi-view video (3D video) is the main form of three-dimensional video. Its interactivity and stereoscopic perception have drawn increasing attention, reflecting the networked, interactive, and realistic character of next-generation multimedia applications. This technology changes the way video is watched from passive to active. For example, within one scene captured by multiple cameras, viewers can actively choose different angles, perspectives, and levels of information, focus on the areas that interest them, and watch synthesized multi-view renderings that provide stereo vision from all around.

These advantages come with problems that cannot be ignored. The amount of multi-view video data grows linearly with the number of cameras and is far larger than that of two-dimensional video. To address this, researchers have proposed many coding schemes that aim at reasonable encoding complexity, moderate storage requirements, and flexible random access in order to achieve efficient multi-view video compression and transmission.
Hence, the mainstream coding schemes focus on reducing spatial, temporal, and information redundancy in the video and on reducing encoding complexity. Fast coding based on video content, however, has received relatively little research, and the characteristics of human stereo vision have been ignored. The redundancy in human visual perception has not yet been exploited. Today's multi-view video coding algorithms are evaluated with criteria such as PSNR (Peak Signal to Noise Ratio), a measure that does not fully agree with the human visual system. This paper presents a visual-perception coding scheme based on regions of interest. Its main contributions are as follows.

First, the current mainstream region-of-interest methods for video sequences share a common limitation: they do not take the depth map into account, so they can only be used for object segmentation in flat 2D video, and their segmentation is inaccurate despite a heavy computational load. This paper proposes a novel region-of-interest extraction method for the left view; for the right view, extraction is based on the left-view result. The method combines Sobel-operator extraction, the OTSU extraction method, and depth-based background segmentation, and by exploiting the characteristics of each it obtains the desired segmentation results. Moreover, its main operations are simple block-based logic operations, additions, and subtractions, which greatly reduces computational complexity.

Secondly, building on traditional multi-view region-of-interest extraction algorithms, this paper presents a reference scheme for the middle view based on HBP.
Analysis of the correlation between videos shows that not only the video sequences themselves but also the regions of interest extracted from the two views are strongly correlated across viewpoints. Therefore, a low-complexity, high-precision region-of-interest extraction method for the left and right views is proposed.

Finally, today's multi-view video coding algorithms are evaluated with criteria such as PSNR (Peak Signal to Noise Ratio), a measure that does not fully agree with the human visual system. This paper uses the region-of-interest extraction model proposed above to guide multi-view video coding and transmission, which avoids relying on PSNR in this step. Because the human eye is differently sensitive to regions of interest and regions of non-interest, the model applies different encoding modes to them while keeping quality close, which greatly reduces coding time. It also provides a good solution for real-time encoding.
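
The first contribution above (Sobel edges, OTSU thresholding, and depth-based background separation combined through block-level counting and logic) can be illustrated with a minimal sketch. This is not the thesis's actual algorithm, whose details the abstract does not give. The function names, the 16x16 block size, the rule for combining the three cues, and the convention that larger depth values mean nearer objects are all assumptions made for illustration.

```python
import numpy as np

def sobel_magnitude(img):
    """Gradient magnitude from 3x3 Sobel kernels (borders padded by replication)."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)

def otsu_threshold(values, bins=256):
    """Otsu's method: threshold that maximizes the between-class variance."""
    hist, edges = np.histogram(values, bins=bins)
    hist = hist.astype(np.float64)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)            # pixels at or below each bin
    w1 = w0[-1] - w0                # pixels above each bin
    m0 = np.cumsum(hist * centers)  # cumulative intensity mass
    valid = (w0 > 0) & (w1 > 0)
    var = np.zeros_like(centers)
    mu0 = m0[valid] / w0[valid]
    mu1 = (m0[-1] - m0[valid]) / w1[valid]
    var[valid] = w0[valid] * w1[valid] * (mu0 - mu1) ** 2
    return edges[1:][np.argmax(var)]  # upper edge of the best split bin

def block_roi(luma, depth, block=16):
    """Hypothetical block-level ROI mask from edge and depth cues.

    Assumed rule: a block is ROI if most of its pixels are depth-foreground,
    or if it contains both foreground pixels and edge pixels.
    """
    h = luma.shape[0] // block * block   # crop to whole blocks
    w = luma.shape[1] // block * block
    luma, depth = luma[:h, :w], depth[:h, :w]

    mag = sobel_magnitude(luma)
    edge = mag > otsu_threshold(mag)     # binary edge map
    fg = depth > otsu_threshold(depth)   # assumes larger depth value = nearer

    def block_sum(mask):
        # Per-block pixel counts: additions only, no per-pixel search.
        return mask.reshape(h // block, block, w // block, block).sum(axis=(1, 3))

    edge_cnt, fg_cnt = block_sum(edge), block_sum(fg)
    mostly_fg = fg_cnt * 2 > block * block
    boundary = (fg_cnt > 0) & (edge_cnt > 0)
    return mostly_fg | boundary
```

Calling `block_roi(luma, depth)` on a luma frame and its aligned depth map returns one boolean per block. An encoder could then use that mask to choose a more thorough encoding mode for ROI blocks and a faster one elsewhere, which is the kind of ROI-guided mode selection the abstract describes.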
Keywords/Search Tags: Subjective perception of the human eye, ROI extraction, multi-view video coding, depth, correlation between the viewpoints