With the development of 3D modeling, human-computer interaction, and related research for the metaverse, accurate depth estimation and 3D reconstruction of scenes have become active research topics. Light field (LF) cameras capture light intensity and angular information simultaneously, and the resulting four-dimensional data encodes the depth of the scene. Light fields therefore have the potential to support a truly immersive interactive experience. However, existing light field depth estimation algorithms make insufficient use of the data and extract inaccurate features in complex regions. This thesis proposes a feature extraction and attention mechanism-based depth estimation network (FEAMNet) to improve accuracy in regions with complex texture. FEAMNet contains a dilated-convolution and average-pooling (DCAP) feature extraction module with a large receptive field to acquire multi-scale features. In addition, a channel attention-based disparity regression (CADR) module is introduced to measure the importance weights of different feature channels; it selects the features that contribute most, improving the accuracy and robustness of the algorithm. Because noise points and errors in depth maps are inevitable, we propose an end-to-end parameter-learning neural network (PLNNet) for the global optimization of depth estimation. The central LF sub-aperture image, which contains rich texture information, is used as the guide image. Differentiable convolutional layers and a combined loss function enable end-to-end network training. The proposed algorithm reduces manual parameter adjustment and achieves edge-preserving smoothing, realizing global optimization of the depth map. PLNNet thereby improves the accuracy and robustness of depth estimation. Nevertheless, 3D point clouds generated from depth maps and scanning devices are accompanied by noise, which leads to poor results in subsequent 3D reconstruction. To remove this noise, we combine 3D guided filtering with a statistical filter to denoise the input point cloud models, removing outliers and reducing information redundancy while preserving the edges and global features of the point cloud. Experimental results on light field depth estimation and optimization show that the proposed FEAMNet and PLNNet algorithms achieve the best performance in terms of mean square error and bad pixel rate, obtaining the most accurate disparity maps. In addition, our point cloud processing method maintains the features of the 3D point cloud while removing outliers. Thus, this thesis generates accurate depth maps from light field images and obtains an accurate 3D point cloud model, promoting the application research and development of three-dimensional reconstruction and human-computer interaction oriented to the metaverse.
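
The two evaluation metrics named above, mean square error and bad pixel rate, can be illustrated with a minimal sketch. This is not the thesis's evaluation code: the function names are hypothetical, the disparity maps are plain nested lists, and the 0.07 threshold is only an assumed default of the kind commonly used in light field benchmarks.

```python
def _pixel_pairs(pred, gt):
    """Yield (predicted, ground-truth) disparity pairs from two 2D maps."""
    for row_pred, row_gt in zip(pred, gt):
        for p, g in zip(row_pred, row_gt):
            yield p, g


def mean_square_error(pred, gt):
    """Mean of the squared per-pixel disparity errors."""
    errors = [(p - g) ** 2 for p, g in _pixel_pairs(pred, gt)]
    return sum(errors) / len(errors)


def bad_pixel_rate(pred, gt, threshold=0.07):
    """Fraction of pixels whose absolute disparity error exceeds `threshold`.

    The default threshold is an assumption made for illustration only.
    """
    flags = [abs(p - g) > threshold for p, g in _pixel_pairs(pred, gt)]
    return sum(flags) / len(flags)


# Toy 2x2 disparity maps: exactly one pixel is off by 0.5.
gt = [[0.0, 1.0], [2.0, 3.0]]
pred = [[0.0, 1.5], [2.0, 3.0]]

print(mean_square_error(pred, gt))  # 0.0625
print(bad_pixel_rate(pred, gt))     # 0.25
```

Lower values are better for both metrics: the first penalizes large errors quadratically, while the second only counts how many pixels miss the tolerance.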