
Deep Neural Network Based Depth Recovery From Multi-Modality Input

Posted on: 2019-09-08  Degree: Master  Type: Thesis
Country: China  Candidate: Y Xu  Full Text: PDF
GTID: 2428330548977454  Subject: Computer Science and Technology
Abstract/Summary:
Depth estimation is a long-standing subject in the field of computer vision. Recently, monocular depth estimation based on CNNs (Convolutional Neural Networks) has drawn intensive attention in academia. However, the depth predicted by a CNN from a single image tends to be blurry and of incorrect scale due to the inherently ill-posed nature of the problem. To tackle this problem, we propose a CNN-based network with multi-modality inputs, where we resort to extra sparse depth samples generated by stereo pairs, structure-from-motion, or LiDAR sensors, since these provide reliable sparse samples either via feature matching bounded by geometric constraints or directly via hardware sensors. Specifically, our proposed model consists of two phases: a coarse depth prediction phase with RGB input, and a depth refinement phase with multi-modality input comprising the RGB image, the sparse depth samples, and the coarse depth prediction. Moreover, a residual learning method is introduced together with a novel sparse discriminator to promote robustness to defective input. Extensive experiments demonstrate that, compared with existing learning-based monocular methods, our model achieves decreases of 0.39 m and 0.85 m in the RMSE metric on the NYU [1] and KITTI [2] datasets, respectively, and models finer details than the previous state of the art. Further applications to stereo matching refinement and LiDAR super-resolution manifest the effectiveness of our approach in reducing the ill-posedness of the monocular depth prediction problem.
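The two ingredients of the evaluation and refinement described above can be sketched in plain Python: the RMSE metric used for the NYU and KITTI comparisons, and the residual-learning step in which the refinement phase predicts a correction that is added to the coarse depth rather than regressing depth from scratch. This is a minimal illustrative sketch, not the thesis's implementation; the function names `rmse` and `residual_refine` and the use of flat lists in place of depth maps are assumptions for clarity.

```python
import math

def rmse(pred, gt):
    """Root-mean-square error between predicted and ground-truth depth values.
    Depth maps are flattened to 1-D lists here for simplicity (an assumption)."""
    assert len(pred) == len(gt) and len(pred) > 0
    return math.sqrt(sum((p - g) ** 2 for p, g in zip(pred, gt)) / len(pred))

def residual_refine(coarse, residual):
    """Residual learning: the refinement network outputs a per-pixel correction
    that is added to the coarse prediction, so the second phase only needs to
    learn the (usually small) error of the first phase."""
    return [c + r for c, r in zip(coarse, residual)]
```

A usage example: if the coarse phase predicts `[2.0, 3.5]` metres and the refinement phase outputs residuals `[-0.1, 0.2]`, the refined depth is `[1.9, 3.7]`, and `rmse` against the ground truth quantifies the improvement.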
Keywords/Search Tags:Depth estimation, Generative Adversarial Network, 3D vision