Font Size: a A A

Monocular Image Depth Estimation Based On Deep Learning

Posted on:2021-03-24Degree:MasterType:Thesis
Country:ChinaCandidate:S T ZhangFull Text:PDF
GTID:2428330629484684Subject:Instrument Science and Technology
Abstract/Summary:PDF Full Text Request
Depth information plays a very important role in understanding the threedimensional geometric relationship in the scene.Accurate and effective depth information can improve the accuracy of many computer vision tasks.Most of the current depth information is obtained through depth sensors,but these methods have limited use scenarios and high hardware costs,which limits their widespread use.Therefore,the research of obtaining depth information directly from images without the help of depth sensors has attracted more and more attention.When the camera is imaging,it projects the three-dimensional space onto the image plane,which will inevitably cause the loss of depth information.The typical depth acquisition methods are mostly achieved through stereo matching of multiple images,and obtaining depth information directly from a single image has long been regarded as a morbid problem and is difficult to achieve.In recent years,with the continuous improvement of computer performance and the continuous development of deep learning,the super feature extraction ability of convolutional neural networks has provided new ideas for the problem of monocular image depth estimation,and many good methods have emerged.However,the depth maps predicted by these methods still have the problems of unclear edge contours and unsmooth depth changes.And most of these methods use complex graph theory inference to optimize the final depth map.Therefore,this paper proposes the following three improvement methods for these problems:1.In order to better extract the multi-scale features in the input image,this paper proposes a feature extraction network structure based on the combination of dense connection network(Dense Net)and expanded space gold tower pooling structure(ASPP).,Solve the problem that the deep network is prone to vanish gradient,explosion and model degradation.ASPP achieves the extraction of multi-scale features in images through the introduction of dilated convolution.The two are applied together to the image encoder-decoder structure,so that the entire network can achieve end-toend training.2.In order to solve the problems of unclear edge contours and unsmooth depth changes in depth maps,this paper transforms the monocular image depth estimation problem into multiple continuous binary classification problems.Compared with the methods of direct regression and multi-classification,the method of using multiple twoclass classification can effectively use the rule that the depth information in the scene is distributed from near to far.Therefore,pixels with similar depths in the image can be less disturbed by noise,the predicted depth does not jump,and finally the effect of smoothing the depth map can be achieved.3.In order to solve the problem that the methods such as Conditional Random Field(CRF)and other methods are complicated and difficult to combine with Convolutional Neural Network(CNN),this study introduced virtual normal vectors to optimize the final depth map.Because the virtual normal vector is calculated by the virtual plane composed of three points in the three-dimensional point cloud,it can reflect the spatial geometric relationship in the scene to a certain extent.Therefore,by minimizing the difference between the virtual normal vectors corresponding to the real depth map and the predicted depth map,the purpose of optimizing the depth map can be achieved.Through training,verification and testing on three typical depth data sets,NYU Depth V2,Make3 D,and KITTI,the experimental results show that the monocular image depth estimation method proposed in this study obtains a clear image with clear edges,distinct layers,and indoors It can be applied in both scenes and outdoor scenes,has a strong generalization,and can meet the needs of a variety of practical scenarios.
Keywords/Search Tags:monocular image depth estimation, deep learning, multi-scale feature extraction, virtual normal vector, binary classification
PDF Full Text Request
Related items