Research On Simultaneous Localization And Mapping Based On Deep Learning In Outdoor Scenes

Posted on: 2023-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Kuang
GTID: 2558306914981919
Subject: Information and Communication Engineering
Abstract/Summary:
Visual Simultaneous Localization and Mapping (VSLAM), which relies on visual sensors, is one of the key technologies for autonomous equipment. Owing to its simple structure and low cost, it is widely used in unmanned aerial vehicles, unmanned vehicles, virtual machines, and augmented/virtual reality. Traditional VSLAM analyzes how features move across consecutive images to compute relative poses, and outputs the full camera trajectory together with a map of the surroundings. However, under pure rotation or small motion, the accuracy of these methods drops because depth information is lacking. Learning based on convolutional neural networks, with its strong feature extraction and scene learning capabilities, can help an existing monocular VSLAM system obtain the depth of the corresponding feature points and significantly improve the robustness and accuracy of the system. SLAM based on deep learning therefore has important research value. The main contributions of this thesis are as follows:

(1) This thesis proposes a novel self-supervised teacher-student network framework to address scale inconsistency in monocular depth prediction. The deep network learns a prediction model of the per-pixel depth distribution from the sample set and, for a given single-frame image, outputs a depth prediction map and its uncertainty map. The reliability of each predicted depth value guides the subsequent structure recovery of spatial points and the pose calculation. Experimental results show that the proposed teacher-student framework achieves the most accurate depth estimation on the KITTI dataset.

(2) To address feature mismatches, this thesis proposes a two-stage screening method based on spatial structure features. The 2D feature point set is screened using motion trend statistics. For the 3D feature point set, a mask-based screening method combines the depth predictions output by the teacher-student network with their uncertainty to select feature points with higher depth prediction quality. The two resulting subsets of matched feature points are used for the subsequent pose calculation.

(3) To integrate the deep learning model into the traditional VSLAM framework, this thesis presents a motion-decoupled odometry framework. The framework selects different feature point sets to compute rotation and translation separately: rotation is estimated with the traditional epipolar geometry method, which avoids the rotation error introduced by depth values; translation is estimated from the screened monocular depth predictions by constructing and solving a minimum reprojection error optimization problem. Experiments on the KITTI dataset show that the odometry based on the motion decoupling framework achieves state-of-the-art performance.
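
The translation step of the motion-decoupled odometry can be illustrated with a small sketch. It is not the thesis implementation: it assumes the rotation R has already been obtained (for example from epipolar geometry), that feature coordinates are already normalized by the camera intrinsics, and it replaces the reprojection error minimization with a simpler linear (algebraic) least-squares form. The function names, the `max_sigma` threshold, and the inverse-uncertainty weighting are hypothetical choices made for illustration only.

```python
import numpy as np

def backproject(pts_norm, depth):
    """Lift normalized image coordinates (N, 2) to 3D points (N, 3) using per-point depth."""
    ones = np.ones((pts_norm.shape[0], 1))
    return np.hstack([pts_norm, ones]) * depth[:, None]

def solve_translation(R, pts1_norm, depth1, sigma1, pts2_norm, max_sigma=0.5):
    """Estimate translation t with the rotation R held fixed.

    Assumptions (illustrative, not from the thesis):
      - matches with depth uncertainty >= max_sigma are masked out;
      - remaining matches are weighted by 1 / sigma;
      - the projection model x2 ~ R X + t is linearized into two
        equations per match (an algebraic stand-in for reprojection error).
    """
    keep = sigma1 < max_sigma                        # uncertainty mask
    X = backproject(pts1_norm[keep], depth1[keep]) @ R.T  # rotated 3D points
    u, v = pts2_norm[keep, 0], pts2_norm[keep, 1]
    n = X.shape[0]

    # From u = (X_x + t_x) / (X_z + t_z):  t_x - u * t_z = u * X_z - X_x
    # From v = (X_y + t_y) / (X_z + t_z):  t_y - v * t_z = v * X_z - X_y
    A = np.zeros((2 * n, 3))
    b = np.zeros(2 * n)
    A[0::2, 0] = 1.0
    A[0::2, 2] = -u
    b[0::2] = u * X[:, 2] - X[:, 0]
    A[1::2, 1] = 1.0
    A[1::2, 2] = -v
    b[1::2] = v * X[:, 2] - X[:, 1]

    # Weight each match's two rows by the inverse of its depth uncertainty.
    w = np.repeat(1.0 / sigma1[keep], 2)
    t, *_ = np.linalg.lstsq(A * w[:, None], b * w, rcond=None)
    return t, keep
```

Because the depth enters through the back-projected points, the recovered translation carries the metric scale of the predicted depth, which is why screening out unreliable depth values before this step matters. A full system would typically refine this linear estimate with a nonlinear reprojection error minimization.
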
Keywords/Search Tags: visual SLAM, depth prediction, motion decoupling