
Perception Based Three-dimensional Video Coding

Posted on: 2015-07-27
Degree: Master
Type: Thesis
Country: China
Candidate: G J Zhang
GTID: 2298330422993045
Subject: Circuits and Systems
Abstract/Summary:
Because of its strong sense of immersion and realistic effects, three-dimensional video (3DV) is receiving more and more attention. Rapid progress in circuit design, network transmission, signal processing, and video coding technology has promoted the wide use of 3DV in various fields. Multi-view Video plus Depth (MVD), the most frequently used scene description format for 3DV, places high demands on network bandwidth and terminal storage capacity and requires efficient compression of the viewpoints. Traditional methods mainly remove the spatial, temporal, and inter-view redundancy of multi-view video and disregard perceptual redundancy, so researchers apply the perceptual features of the Human Visual System (HVS) to further improve compression efficiency. However, existing perceptual video coding approaches do not consider the complexity of the perceptual model, and there is no reasonable perceptual model for multi-view video. In addition, video is steadily moving toward higher resolutions, and the high algorithmic complexity of High Efficiency Video Coding (HEVC), which targets high-resolution video, hinders the development of the standard. This thesis therefore carries out in-depth research on perception-based three-dimensional video coding and on algorithm optimization for HEVC.

(1) Because the perceptual characteristics of the human eye are very complex, perceptual models built on them are particularly complex as well, yet many studies do not take the complexity of the perceptual model into account. In response, the thesis divides video into still, slow-motion, and vigorous-motion regions according to the intensity of movement and, exploiting temporal correlation, establishes a fast model for obtaining the perceptual mask. Still regions obtain their perceptual mask by copying the mask of the previous frame. The perceptual mask of slow-motion regions is predicted from the previous frame's mask. The mask of vigorous-motion regions is difficult to predict accurately from the previous frame, so it has to be recomputed. Experimental results show that, compared with the conventional algorithm for obtaining the perceptual mask, the method reduces encoding time by 77.54%-84.60% without lowering the peak signal-to-perceptual-noise ratio (PSPNR).

(2) Researchers have proposed many perceptual models for single-view video, but features of multi-view video such as binocular fusion, binocular rivalry, and binocular suppression make it difficult to extend a single-view perceptual model to multi-view video. To solve this problem, the thesis uses the stereoscopic masking effect and the Just Noticeable Difference (JND) model to establish an asymmetric stereoscopic video perceptual model. Experimental results show that the bit rate of the right-view video is reduced by 11.45%-18.69% at the same subjective quality of the reconstructed video.

(3) To address the large data volume that high-resolution video still has after compression, the Joint Collaborative Team on Video Coding (JCT-VC) is developing HEVC. The standard improves compression efficiency mainly at the cost of increased algorithmic complexity, which limits its application. In response to this high computational complexity, the thesis proposes a method that adaptively determines the depth range of the coding unit (CU). The method exploits spatial correlation to determine the most probable CU depths and thereby reduces coding complexity. In addition, the thesis uses the probability density functions of the rate-distortion cost (RDCost) of all CUs and of non-split CUs at the same depth level to build a suitable model, and from it determines the RDCost value for early CU termination according to a preset decrease in video quality. Finally, after analyzing the relationship between the first mode in the candidate mode list (CML) and the best intra prediction mode, the thesis proposes a method for redefining the CML. It reduces the number of modes in the CML and so lowers the complexity of rate-distortion optimization (RDO). Experimental results show that the three methods effectively decrease the complexity of intra coding while maintaining the quality of the decoded video.
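The region-based mask reuse described in (1) can be illustrated with a minimal sketch. The abstract does not specify how motion intensity is measured, what the thresholds are, or how the slow-motion prediction works, so everything below is an assumption made for illustration: motion intensity is taken as the mean absolute difference between co-located blocks, the thresholds `t_still` and `t_vigorous` are arbitrary, each block's mask is reduced to a single number, and `predict_mask` and `compute_mask` are hypothetical callables standing in for the thesis's predictor and for the full perceptual-model computation.

```python
from typing import Callable, List, Sequence

Block = Sequence[int]  # flattened luma samples of one block (assumed layout)


def motion_intensity(prev: Block, cur: Block) -> float:
    """Mean absolute difference between co-located blocks.

    This is an assumed stand-in for the thesis's motion measure.
    """
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)


def update_masks(
    prev_blocks: Sequence[Block],
    cur_blocks: Sequence[Block],
    prev_masks: Sequence[float],
    compute_mask: Callable[[Block], float],
    predict_mask: Callable[[float, Block, Block], float],
    t_still: float = 1.0,      # hypothetical threshold
    t_vigorous: float = 8.0,   # hypothetical threshold
) -> List[float]:
    """Return one mask value per block of the current frame."""
    masks: List[float] = []
    for prev, cur, old_mask in zip(prev_blocks, cur_blocks, prev_masks):
        intensity = motion_intensity(prev, cur)
        if intensity < t_still:
            # Still region: copy the previous frame's mask.
            masks.append(old_mask)
        elif intensity < t_vigorous:
            # Slow-motion region: predict from the previous frame's mask.
            masks.append(predict_mask(old_mask, prev, cur))
        else:
            # Vigorous-motion region: recompute the mask from scratch.
            masks.append(compute_mask(cur))
    return masks
```

Only blocks in the last branch pay the full cost of the perceptual model. That is the intuition behind the reported encoding-time savings, although the thesis's actual classification and prediction rules may differ from this sketch.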
Keywords/Search Tags: Three-dimensional video coding, Human perceptual characteristics, Just noticeable difference, Stereoscopic masking effect