Font Size: a A A

Image Depth Estimation Based On Deep Learning And Its Application

Posted on:2020-02-17Degree:MasterType:Thesis
Country:ChinaCandidate:Y LuoFull Text:PDF
GTID:2428330572988966Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
Scene depth estimation is an important topic in the field of computer vision.Using the depth information of the image,the three-dimensional structure information of the scene can be reconstructed,which is of great significance for tasks such as robot autonomous navigation,object recognition and grasp.The traditional visual depth estimation methods utilize the multi-view information of the scene to recover the scene depth from the two-dimensional images through the triangular geometrnc correspondence,which is computationally intensive and complicated.In recent years,with the development of deep learning,the use of convolutional neural networks to reconstruct scene depth has became a hot research direction for researchers.The convolutional neural network can be trained using image data and its corresponding groundtruth of depth data,achieving end-to-end full resolution image depth estimation during testing.The method is not only fast,but also simple to implement,and can realize scale recovery of the scene,which is beneficial to the space task execution of the robot.In this context,an innovative end-to-end deep learning network is proposed based on the deeply study of rthe depth estimation methods using convolutional neural network in recent years.The experimental results show that the proposed method can further improve the performance of the algorithm.This paper proposes an end-to-end learning scheme for predicting scaled dense depth map from sparse depth map and RGB image.In this scheme,a sparse depth map is generated by sparse sampling,and then a color image and a sparse depth map are input as a network,and a full-resolution depth image is output.During the training process,the sparse depth map is used as a supervised signal of the depth estimation network to recover the global scale of the scene.In order to estimate the depth of the scene accurately,this paper introduces the"correlation"layer to artificially simulate the standard matching process to fuse the sparse depth information and color image information,that is,the color information is used to improve the depth prediction accuracy based on the sparse depth map.Finally,the dense depth map is output at full resolution using the refinement module.The evaluation results on the NYU-Depth-V2 and KITTI datasets show that the model can reconstruct the depth of the scene at full resolution and has better performance than the state-of-the-art algorithms.This paper proposes a depth estimation network and a camera pose estimation network constructed in parallel.The camera pose estimation network takes a monocular video sequence as input and outputs a 6-DoF pose.The depth estimation network takes a monocular image as input and generates a dense depth map.Finally,based on the camera model,a synthetic view is generated and used as a supervised signal to jointly train two parallel estimation networks.Meanwhile,the sparse depth map generated by sparse sampling serves as another supervised signal for the depth estimation network to recover the global scale.The scale information obtained by the depth estimation network is shared to the pose estimation network through photometric error coupling.During testing,the depth estimator and the pose estimator can be used independently.The algorithm of this paper is experimentally evaluated on the KITTI dataset,which is superior to the state-of-the-art algorithms in many indicators.
Keywords/Search Tags:Depth estimation, Visual odometry, Deep learning
PDF Full Text Request
Related items