Research On Monocular Depth Estimation Algorithm Based On Deep Learning

Posted on: 2022-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Zhou
Full Text: PDF
GTID: 2518306524998899
Subject: Computer application technology
Abstract/Summary:
Depth estimation is one of the most important technologies in environment recognition and has great application value in autonomous driving and virtual reality. The traditional way to obtain scene depth is LiDAR (laser radar), but it requires specialized equipment and places certain requirements on the ranging environment. Depth estimation based on deep learning is one of the basic tasks of computer vision, and monocular depth estimation in particular has the advantages of low cost and simple operation. Monocular depth estimation recovers the depth of a scene from a two-dimensional image taken by an ordinary camera. Because scenes at different depths can produce the same image, the mapping from scenes to images is many-to-one, which makes monocular depth estimation a challenging task. The main work of this thesis is summarized as follows:

(1) Review and study of monocular depth estimation models based on deep learning
Monocular depth estimation models based on deep learning were first proposed in 2014. According to the type of training data, depth estimation models can be divided into single-image training models, multi-image training models, and methods that optimize the result with auxiliary information. Single-image training models use color-depth image pairs as training data and predict the depth map by continuous depth regression. Multi-image training models are divided into stereo-image training models and image-sequence training models: the former estimate depth from a disparity map, while the latter predict depth through structure-from-motion reconstruction. Methods that optimize the depth map with auxiliary information improve the accuracy of depth estimation by using information such as semantic labels. I classified and reviewed the existing depth estimation models according to the type of training data, and then compared and analyzed the advantages and disadvantages of models trained on each type of data.

(2) A depth estimation network based on spatial pyramid and pixel shuffle
I proposed an end-to-end depth estimation model based on a spatial pyramid and pixel shuffle. First, the backbone network extracts global and local features from the input image, and the spatial pyramid module then processes the features extracted by the backbone. In the decoding network, I designed a pixel shuffle decoding module, which uses pixel shuffle to integrate the global features with the spatial pyramid features and to increase the feature resolution. The depth map is generated after four pixel shuffle decoding modules. The spatial pyramid module extracts features at each scale, and the pixel shuffle decoding module increases the feature resolution while retaining all of the feature information. Experimental results show that the proposed method enhances the feature information and improves the accuracy of depth estimation.

(3) A depth estimation network based on multi-scale feature fusion and a channel attention mechanism
To address blurred boundaries and missing small objects in depth maps, I proposed a depth estimation method based on multi-scale feature fusion and a channel attention mechanism. The channel attention mechanism is applied in the decoding network to enhance its ability to predict global depth. The multi-scale feature fusion network up-samples the features extracted by the feature extraction network and fuses them with the output of the decoding network to enhance the local depth information of the depth map. Experiments show that the proposed method enhances the decoding network's ability to predict global depth and improves the accuracy of the depth map.
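
The two decoder operations named in the abstract, pixel shuffle and channel attention, can be illustrated with a minimal NumPy sketch. It is not the thesis implementation. The abstract does not state the exact design, so the function names, the channel-first (C, H, W) layout, the squeeze-and-excitation style gating, and the weight matrices `w1` and `w2` are all assumptions made for illustration.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    No value is dropped or interpolated: groups of r*r channels are
    moved into an r-by-r spatial neighbourhood.
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)


def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel gating (an assumed design).

    x  : (C, H, W) feature map
    w1 : (C_reduced, C) hypothetical weights of the first dense layer
    w2 : (C, C_reduced) hypothetical weights of the second dense layer
    """
    squeezed = x.mean(axis=(1, 2))                  # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)         # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # sigmoid, one weight per channel
    return x * gate[:, None, None]                  # rescale each channel


if __name__ == "__main__":
    features = np.arange(16, dtype=np.float64).reshape(4, 2, 2)
    upsampled = pixel_shuffle(features, 2)
    print(upsampled.shape)  # (1, 4, 4)
```

Because pixel shuffle only rearranges values, the output has the same number of elements as the input, which is consistent with the abstract's statement that the decoding module raises resolution while retaining all of the feature information.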
Keywords/Search Tags:Deep learning, Depth estimation, Attention mechanism, Multi-scale feature fusion