Research On Multi-view Depth Estimation Method Based On Deep Learning

Posted on: 2023-07-06
Degree: Master
Type: Thesis
Country: China
Candidate: J W Yu
Full Text: PDF
GTID: 2568306812975749
Subject: Engineering
Abstract/Summary:
Multi-view stereo (MVS) depth estimation uses camera parameters and images from multiple views to obtain a depth map for each view. It is one of the methods for recovering 3D spatial structure from images and is used in professional fields such as heritage restoration and 3D printing. Deep learning-based MVS depth estimation runs a convolutional neural network on a graphics processor, which makes it faster and more robust in complex environments. In these methods, feature extraction relies mainly on two-dimensional convolution and is influenced by that structure, so modifying the feature extraction part can improve accuracy. In addition, the cost volumes are large and the structures that process them have many parameters, so improving cost volume construction and cost volume regularization can improve the efficiency of the algorithm. Based on these two observations, the thesis designs one algorithm aimed at accuracy and one aimed at efficiency.

To improve the accuracy of MVS depth estimation, an algorithm based on self-attention and ASFF (adaptive spatial feature fusion) is proposed. The method uses ASFF as the decoder, which strengthens the decoder's feature fusion capability. It adopts independent self-attention combined with the CDSConv structure as the backbone network, which improves global feature extraction. To further improve the effectiveness of regularization, it uses a stacked hourglass network for cost volume regularization.

To reduce graphics memory usage and improve the efficiency of MVS depth estimation, the thesis proposes an algorithm based on lightweight convolution and ASFF. Experiments show that memory consumption and computational cost are concentrated in cost volume construction and cost volume regularization. The method therefore uses group correlation to compress the cost volume along the channel dimension and builds this part of the network from lightweight convolutions, which reduces the memory footprint of the cost volume and the regularization network and improves the efficiency of the algorithm.

In summary, the algorithm based on self-attention and ASFF improves the CDSConv-based feature extraction network by combining CDSConv with self-attention, uses ASFF for feature fusion, and uses a stacked hourglass network for regularization, which improves the accuracy and completeness of MVS depth estimation to some degree. The algorithm based on lightweight convolution and ASFF applies lightweight convolution and group correlation on top of CDSConv. It verifies the feasibility of using lightweight convolution for cost volume regularization and uses it to reduce graphics memory usage and improve the efficiency of MVS depth estimation.
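The channel compression by group correlation mentioned above can be illustrated with a small sketch. The abstract does not give the thesis's exact formulation, so this follows one common form of group-wise correlation: the channels are split into equal groups, and each group contributes the mean of the element-wise products between reference and source features. The function name `group_correlation`, the `[channels][pixels]` list layout, and the toy values are all assumptions made for illustration; the source features are assumed to be already warped to the reference view for one depth hypothesis.

```python
from typing import List

Feature = List[List[float]]  # hypothetical layout: [channels][pixels]


def group_correlation(ref: Feature, src: Feature, groups: int) -> Feature:
    """Compress a C-channel feature pair into `groups` similarity channels.

    `src` is assumed to be warped into the reference view already.
    """
    channels = len(ref)
    if channels != len(src) or channels % groups != 0:
        raise ValueError("channel counts must match and divide evenly into groups")
    per_group = channels // groups
    pixels = len(ref[0])

    out: Feature = []
    for g in range(groups):
        start = g * per_group
        row = []
        for p in range(pixels):
            # Mean of the element-wise products inside this channel group.
            total = sum(ref[c][p] * src[c][p] for c in range(start, start + per_group))
            row.append(total / per_group)
        out.append(row)
    return out


# Toy example: 4 channels, 2 pixels, compressed to 2 similarity channels.
ref = [[1.0, 2.0], [3.0, 4.0], [0.5, 0.0], [2.0, 1.0]]
src = [[1.0, 0.0], [1.0, 2.0], [2.0, 5.0], [0.0, 3.0]]
sim = group_correlation(ref, src, groups=2)
print(sim)  # [[2.0, 4.0], [0.5, 1.5]]
```

With C feature channels and G groups, each depth hypothesis stores G values per pixel instead of C, which is where the saving in cost volume memory comes from in this kind of scheme.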
Keywords/Search Tags:Multi-view stereo, Depth estimation, Feature pyramid network, Lightweight convolution, Self-attention