
Design And Implementation Of Unsupervised Neural Networks Based Structure From Motion Algorithm

Posted on: 2021-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: H S Li
Full Text: PDF
GTID: 2518306308469834
Subject: Computer technology
Abstract/Summary:
Structure from motion (SfM), which infers the 3D structure of a scene from a sequence of 2D images, is attracting significant attention from academia and industry because of its potentially wide range of computer vision applications, including localization and navigation systems and autonomous driving platforms. However, scenes often contain regions from which features are difficult to extract, such as occluded areas and weakly textured areas, which makes structure-from-motion estimation challenging. Accurate and well-generalized SfM algorithms therefore remain difficult to obtain, and traditional SfM algorithms perform poorly under these conditions. The emergence of deep convolutional neural networks has greatly advanced many sub-fields of computer vision, and learning scene depth and camera pose from video sequences with deep convolutional networks has become a research hotspot in the field. Algorithms based on supervised neural networks use ground truth collected by radar to supervise training and achieve state-of-the-art results on the problem. However, these algorithms are difficult to apply to other scenarios, and small changes in the scene can greatly affect their performance. For this reason, unsupervised neural networks have gradually attracted researchers' attention in recent work. Most prior work on unsupervised depth learning uses monocular video sequences as network input. However, the results of these methods require a scale factor computed frame by frame to maintain a stable relative scale. In this thesis, we propose an unsupervised learning framework for joint depth and ego-motion estimation from stereo sequences. Stereo sequences provide both spatial (left to right) and temporal (forward to backward) photometric warping constraints as the training signal, and they allow scene depth and camera pose to be recovered at absolute scale, which is of great significance for vision guidance. Experiments on the KITTI driving dataset show that our framework outperforms state-of-the-art methods based on unsupervised neural networks.
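To make the spatial (left to right) photometric warping constraint and the absolute-scale property more concrete, the following is a minimal NumPy sketch. It is not the thesis's implementation: it assumes rectified stereo images, a per-pixel disparity map for the left view, and a plain L1 photometric error. The function names (disparity_to_depth, warp_right_to_left, photometric_l1) and the parameters focal_px and baseline_m are hypothetical names chosen for illustration.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m, eps=1e-6):
    """Metric depth for rectified stereo: Z = f * B / d.

    The known baseline B (in metres) is what fixes the absolute scale.
    """
    return focal_px * baseline_m / np.maximum(disparity, eps)

def warp_right_to_left(right, disparity):
    """Reconstruct the left view by sampling the right image at x - d.

    Assumes grayscale images of shape (H, W); uses linear interpolation
    along each row and clamps sample positions to the image border.
    """
    h, w = disparity.shape
    xs = np.arange(w, dtype=np.float64)[None, :] - disparity
    xs = np.clip(xs, 0, w - 1)
    x0 = np.floor(xs).astype(int)          # left neighbour column
    x1 = np.minimum(x0 + 1, w - 1)         # right neighbour column
    frac = xs - x0                         # interpolation weight
    rows = np.arange(h)[:, None]
    return (1.0 - frac) * right[rows, x0] + frac * right[rows, x1]

def photometric_l1(left, right, disparity):
    """Mean absolute difference between the left image and its reconstruction."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disparity))))
```

In a learning setup, a network would predict the disparity (or depth) and be trained to minimize an error of this kind, so no ground-truth depth is needed. The temporal constraint is analogous, except that the warp is driven by predicted depth and camera pose between consecutive frames rather than by a horizontal disparity.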
Keywords/Search Tags: depth estimation, pose estimation, unsupervised neural network, confidence map