Three-dimensional (3D) light field display technology provides 3D images with true depth and correct occlusion by fitting the light field distribution of a 3D scene, and has gradually become an important development direction in the display field. To adapt to the hardware design of 3D light field displays, it is necessary to explore the corresponding content generation technology. Fast and high-performance 3D light field content generation provides rich content and further expands the applications of 3D light field display technology. 3D light field content generation synthesizes dense views from sparse view images, and then encodes and reconstructs the stereoscopic display image to achieve the correct 3D display effect. Dense view synthesis requires the depth information of the 3D scene, and stereo matching based on binocular vision is one of the most important methods for depth computation. As ill-posed problems in computer vision, stereo matching and dense view synthesis are also the key and difficult problems in 3D light field content generation. In this dissertation, the key technologies of 3D light field content generation based on binocular vision are investigated. The main research contents and innovations are as follows.

(1) Stereo matching technology based on superpixel information refinement

To address unreliable matching in fine details, texture-less regions, and occlusions, a stereo matching method based on superpixel information refinement, named SG-Stereo, is proposed. A superpixel is an irregular set of neighboring pixels with similar features, which provides rich information about object edges and context. Within a multi-task learning framework, SG-Stereo learns superpixel segmentation and disparity estimation with a superpixel branch and a disparity branch, respectively, and realizes superpixel-guided disparity refinement through the interaction between the two branches. A novel Superpixel-Attention Spatial Pooling Pyramid (SA-SPP) module and a Superpixel Guided Refinement (SGR) module are designed for this interaction. To alleviate the blurring of fine details during feature sampling, superpixel pooling is designed to replace the average pooling in the SA-SPP module; a minimal sketch of this pooling idea is given below. The SGR module exploits multi-scale superpixel information to refine the initial disparity estimation and further improve performance. Experimental results demonstrate that SG-Stereo significantly improves the quality of disparity estimation, especially in detailed, texture-less, and occluded regions. Introducing superpixel information reduces the disparity error by 30.1% and 21.3% on the Scene Flow and KITTI 2015 datasets, respectively.
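This summary does not reproduce implementation details, so the following is only a minimal sketch of what superpixel pooling could look like, assuming a precomputed hard superpixel label for each pixel; the actual SA-SPP module may use soft assignments and attention weighting on top of this, and the function name and tensor layout are illustrative assumptions rather than the author's code.

```python
import torch

def superpixel_pooling(feat, sp_labels, num_superpixels):
    """Pool features over superpixel regions instead of a fixed grid.

    feat:       [B, C, H, W] feature map
    sp_labels:  [B, H, W]    integer superpixel index of each pixel
    Returns a [B, C, H, W] map where every pixel holds the mean feature
    of the superpixel it belongs to.
    """
    B, C, H, W = feat.shape
    flat_feat = feat.reshape(B, C, H * W)                             # [B, C, N]
    flat_lbl = sp_labels.long().reshape(B, 1, H * W).expand(-1, C, -1)

    # Sum the features and count the pixels of each superpixel.
    sums = torch.zeros(B, C, num_superpixels, device=feat.device)
    sums.scatter_add_(2, flat_lbl, flat_feat)
    counts = torch.zeros(B, C, num_superpixels, device=feat.device)
    counts.scatter_add_(2, flat_lbl, torch.ones_like(flat_feat))
    means = sums / counts.clamp(min=1)                                # per-superpixel mean

    # Broadcast each superpixel mean back to its member pixels.
    pooled = torch.gather(means, 2, flat_lbl)
    return pooled.reshape(B, C, H, W)
```

Unlike grid average pooling, the pooling region here follows the object boundaries given by the superpixels, which is why it can better preserve edges and small structures when the features are downsampled.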
(2) Real-time stereo matching technology based on spatio-temporal consistency refinement

Existing stereo matching methods run slowly and cannot meet real-time application requirements. To address this problem, a real-time stereo matching method based on spatio-temporal consistency refinement, named RST-Stereo, is proposed. The method uses a lightweight pyramid matching network to estimate an initial disparity, which is then refined by a spatial consistency refinement (SCR) module and a temporal consistency refinement (TCR) module. Based on the local spatial consistency of disparity, the SCR module refines the estimates in unreliable regions using neighboring high-confidence predictions; a simplified sketch of this confidence-guided refinement appears at the end of this summary. The TCR module further refines and constrains the disparity estimates with the motion information between adjacent frames. Experimental results demonstrate that RST-Stereo achieves high-quality disparity estimation at a real-time speed of more than 40 FPS. Introducing spatio-temporal consistency refinement reduces the disparity error by 44.5% and 32.4% on the Scene Flow and KITTI 2015 datasets, respectively.

(3) Dense view synthesis technology based on depth information

Dense view synthesis involves two key problems: virtual view synthesis and dense-view parameter design (the number of views and the per-view resolution). For the first problem, a depth-based virtual view synthesis method for real scenes, named R-DIBR, is proposed. It synthesizes virtual views in four steps: checkerboard rectification, stereo matching, pixel mapping, and hole filling; the pixel mapping step is sketched at the end of this summary. For the problem of dense-view parameter design, the total resolution of a 3D light field display is fixed, so increasing the number of views reduces the resolution of each view. The number of views affects the spatial smoothness of the display, while the view resolution affects its definition, and both matter for display quality, so the choice of dense-view parameters directly affects the performance of the 3D light field display. To address this problem, a dense-view parameter design method based on visual perception is proposed, which uses subjective evaluations of 3D light field display quality under different parameter settings to inversely optimize the parameter design. Experimental results demonstrate that R-DIBR synthesizes high-quality virtual images of real scenes, and that the parameter design method further improves the image quality of the 3D light field display.
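The SCR module of RST-Stereo is a learned component; purely as an illustration of the local spatial consistency idea, the sketch below replaces low-confidence disparities with a confidence-weighted average of reliable neighbors. The threshold, window size, and function name are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def spatial_consistency_refine(disp, conf, conf_thresh=0.5, win=9):
    """Refine unreliable disparity pixels from reliable neighbors.

    disp: [B, 1, H, W] initial disparity
    conf: [B, 1, H, W] per-pixel confidence in [0, 1]
    """
    reliable = (conf > conf_thresh).float()

    # Confidence-weighted local average computed over reliable pixels only.
    kernel = torch.ones(1, 1, win, win, device=disp.device)
    weighted_sum = F.conv2d(disp * conf * reliable, kernel, padding=win // 2)
    weight = F.conv2d(conf * reliable, kernel, padding=win // 2)
    local_avg = weighted_sum / weight.clamp(min=1e-6)

    # Keep reliable predictions, replace unreliable ones with the local average.
    return reliable * disp + (1.0 - reliable) * local_avg
```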
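For the pixel mapping step of R-DIBR, the sketch below shows a basic forward warping of a rectified reference view to a virtual viewpoint on the same baseline, using the estimated disparity and a z-buffer to resolve collisions; pixels that receive no value are returned as holes for the subsequent filling step. The interface and the interpolation factor alpha are illustrative assumptions, and a practical implementation would vectorize the loop.

```python
import numpy as np

def map_pixels_to_virtual_view(ref_img, disp, alpha):
    """Forward-map reference pixels to a virtual view on the rectified baseline.

    ref_img: [H, W, 3] reference color image
    disp:    [H, W]    disparity of the reference view (in pixels)
    alpha:   position of the virtual camera (0 = reference view, 1 = other view)
    Returns the warped image and a mask of holes to be filled afterwards.
    """
    H, W = disp.shape
    warped = np.zeros_like(ref_img)
    zbuf = np.full((H, W), -np.inf)                      # keep the closest surface (largest disparity)

    ys, xs = np.mgrid[0:H, 0:W]
    xt = np.round(xs - alpha * disp).astype(np.int64)    # purely horizontal shift after rectification
    valid = (xt >= 0) & (xt < W)

    for y, x, x_new in zip(ys[valid], xs[valid], xt[valid]):
        if disp[y, x] > zbuf[y, x_new]:
            zbuf[y, x_new] = disp[y, x]
            warped[y, x_new] = ref_img[y, x]

    holes = ~np.isfinite(zbuf)                           # pixels never written -> cavities
    return warped, holes
```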