
Research on Key Technology of Light-field Content Generation Based on Scene Geometric Structure

Posted on: 2023-09-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Qi
Full Text: PDF
GTID: 1528306944964189
Subject: Electronic Science and Technology
Abstract/Summary:
Compared with traditional 2D display technology, 3D light-field display technology can present content with spatial depth information by reproducing the original light-field distribution of a real scene. Its display effect is more realistic and immersive, making it one of the most important directions for the future development of display technology. When the light field of a real scene is captured to provide display content, limitations on camera size and transmission bandwidth mean that the acquired data are usually sparse-view images. Because the disparity of a sparse-view image sequence is not smooth enough, light-field content generation methods are needed to synthesize virtual view images between the sparse views and ultimately produce a dense view sequence with continuous, smooth disparity. Light-field content generation can supply more display content for 3D light-field display devices and thus promotes the wide application of 3D light-field display technology. However, current light-field content generation methods still suffer from problems such as low robustness and inaccurate results in occluded regions. With the help of scene geometry information, an accurate mapping between corresponding pixels in multi-view images can be constructed, which guides the generation of high-quality light-field content. Therefore, this thesis focuses on 3D light-field content generation based on scene geometric structure data. First, a method for refined generation of the scene geometric structure is designed. Then, based on the generated scene geometry, a robustness enhancement method and an occluded-region quality enhancement method for virtual view synthesis are proposed. The main research contents and innovations are as follows:

(1) Refined scene geometric structure generation based on unsupervised learning. To relieve the strong dependence of existing scene geometry generation methods on depth supervision data, a scene geometry generation method based on unsupervised learning is proposed. The method uses the reprojection error of the input images as the constraint function and optimizes the network parameters without any depth supervision data. To relieve the low resolution of the depth maps produced by existing methods, an iterative refinement strategy is proposed: a multi-stage matching cost volume is built so that the depth search space is gradually narrowed, which reduces memory consumption, and the saved memory is used to increase the resolution of the depth map. In addition, a feature-vector correlation enhancement module is designed, which reduces data dimensionality by introducing correlation information between the input views, further lowering memory consumption and improving the accuracy of the results. With the iterative refinement strategy, the average error of the algorithm on the DTU dataset is reduced by 31.58%. Without additional memory consumption, the method increases the resolution of the generated depth map by a factor of four and produces scene geometry with rich detail.
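As an illustration of the unsupervised constraint described in (1), the sketch below shows a photometric reprojection loss: the predicted depth back-projects reference-view pixels to 3D, projects them into a source view, and the warped source image is compared with the reference image, so no ground-truth depth is required. This is a minimal sketch under assumptions of my own (PyTorch, batched pinhole intrinsics, an L1 photometric error); the thesis does not specify its framework or loss details.

import torch
import torch.nn.functional as F

def reprojection_loss(ref_img, src_img, depth, K, K_inv, T_ref_to_src):
    # ref_img, src_img: (B, 3, H, W) colour images
    # depth:            (B, 1, H, W) depth predicted for the reference view
    # K, K_inv:         (B, 3, 3) intrinsics and their inverse
    # T_ref_to_src:     (B, 4, 4) relative pose from reference to source camera
    B, _, H, W = ref_img.shape
    device = ref_img.device

    # Homogeneous pixel grid of the reference view, shape (3, H*W)
    ys, xs = torch.meshgrid(torch.arange(H, device=device),
                            torch.arange(W, device=device), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().reshape(3, -1)

    # Back-project with the predicted depth, then transform into the source camera
    cam_pts = (K_inv @ pix) * depth.reshape(B, 1, -1)                    # (B, 3, H*W)
    cam_pts_h = torch.cat([cam_pts, torch.ones(B, 1, H * W, device=device)], dim=1)
    src_cam = (T_ref_to_src @ cam_pts_h)[:, :3]                          # (B, 3, H*W)

    # Project into the source image plane and sample the source colours there
    proj = K @ src_cam
    xy = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    grid = torch.stack([2.0 * xy[:, 0] / (W - 1) - 1.0,
                        2.0 * xy[:, 1] / (H - 1) - 1.0], dim=-1).reshape(B, H, W, 2)
    warped = F.grid_sample(src_img, grid, align_corners=True)

    # Photometric reprojection error serves as the unsupervised training signal
    return (warped - ref_img).abs().mean()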
(2) Robustness enhancement method for virtual view synthesis based on feature fusion. To relieve the low robustness of existing virtual view synthesis methods when the baseline between the input sparse views is wide or the rotation angle is large, a feature-fusion-based robustness enhancement method is proposed. The sparse-view color images are first converted into feature maps by a neural network. The feature maps are then mapped into the virtual view under the guidance of the scene geometric structure, fused into a virtual-view feature map, and finally rendered as a virtual-view color image. By explicitly introducing the multi-view projection process into the neural network computation framework, the robustness of virtual view synthesis is improved. A feature reprojection aggregation module is constructed to implicitly model the multi-view information fusion process, which further improves the quality of the synthesized views (a minimal sketch of this warping-and-fusion step follows the abstract). In addition, to relieve the reduction in scene geometry accuracy caused by the loss of virtual-view color images, an input-view selection strategy that balances disparity against the shared observation area is proposed, which improves the accuracy of the scene geometry. The algorithm is evaluated on the Middlebury and Tanks & Temples datasets. Compared with a DIBR method that maps pixels directly, introducing feature fusion improves the SSIM and PSNR of the generated virtual views by 5.25% and 3.73 dB, respectively.

(3) Quality improvement of occluded regions in virtual view synthesis based on occlusion perception. Existing virtual view synthesis methods that take only 2D images as input have difficulty judging occlusion relationships accurately, so their results are prone to aliasing artifacts in occluded areas. To solve this problem, a method for improving the quality of occluded areas in virtual view synthesis is proposed. The method adds a structured-light sensor alongside the color camera, generates the scene geometric structure with a truncated signed distance function (TSDF), accurately judges the occlusion relationship between foreground and background under the virtual view, and uses this relationship to guide the synthesis toward accurate results in occluded areas, thereby improving the quality of virtual-view images in those regions (a simple depth-comparison occlusion test is sketched below). Tests on real indoor scenes show that, by accurately distinguishing the occlusion relationship between foreground and background, the method effectively improves the accuracy of the generated virtual-view images in occluded areas. Compared with NeRF, a method based on neural radiance fields, the SSIM of the generated views is improved by 4.58% and the PSNR by 3.90 dB.
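The warping-and-fusion step referenced in (2) can be pictured with the following minimal sketch. It is not the thesis's network: the module name, the two-layer encoder and decoder, and the plain averaging used in place of the learned feature reprojection aggregation module are illustrative assumptions (PyTorch is likewise assumed). Each sparse view is encoded into a feature map, warped into the virtual view with a sampling grid derived from the scene geometry, averaged, and decoded to a color image; a learned aggregation, as described above, would replace the fixed average.

import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureFusionRenderer(nn.Module):
    # Hypothetical minimal pipeline: encode, warp, fuse, decode.
    def __init__(self, feat_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(feat_dim, feat_dim, 3, padding=1))
        self.decoder = nn.Sequential(nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
                                     nn.Conv2d(feat_dim, 3, 3, padding=1))

    def forward(self, views, warp_grids):
        # views:      list of (B, 3, H, W) sparse-view colour images
        # warp_grids: per-view (B, H, W, 2) sampling grids computed from the
        #             reconstructed scene geometry (projection into each source view)
        warped = [F.grid_sample(self.encoder(v), g, align_corners=True)
                  for v, g in zip(views, warp_grids)]
        fused = torch.stack(warped, dim=0).mean(dim=0)   # simple average in place of learned fusion
        return self.decoder(fused)                       # (B, 3, H, W) virtual-view colour image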
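For (3), the benefit of explicit geometry can be seen in a simple depth-comparison occlusion test: a source sample whose depth in the virtual camera lies clearly behind the surface rendered from the TSDF geometry is treated as occluded and excluded from blending. The function below is a hypothetical NumPy sketch; the array layouts, the eps threshold, and the uniform visibility weighting are my assumptions, not the thesis's method.

import numpy as np

def occlusion_aware_blend(colors, reproj_depths, surface_depth, eps=0.01):
    # colors:        (V, H, W, 3) source colours already warped into the virtual view
    # reproj_depths: (V, H, W) depth of each warped sample in the virtual camera
    # surface_depth: (H, W) surface depth rendered from the TSDF scene geometry
    visible = reproj_depths <= surface_depth[None] + eps      # samples behind the surface are occluded
    weights = visible.astype(np.float32)
    total = weights.sum(axis=0, keepdims=True)                # (1, H, W)
    weights = np.where(total > 0, weights / np.maximum(total, 1e-6), 0.0)
    return (colors * weights[..., None]).sum(axis=0)          # (H, W, 3) blended virtual view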
Keywords/Search Tags: Light-field display, Virtual view synthesis, Scene geometric model reconstruction, Unsupervised learning, Multi-view stereo