
Research On Key Technologies Of SLAM System Based On Deep Learning

Posted on: 2023-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Cai
Full Text: PDF
GTID: 2558306914479464
Subject: Electronic Science and Technology
Abstract/Summary:
Autonomous driving and robotics have become a focus of attention in recent years, and SLAM (Simultaneous Localization and Mapping) is one of their core technologies. SLAM algorithms are widely used to solve robot localization problems. In complex environments with dynamic objects and significant lighting changes, however, traditional methods can establish wrong feature associations, which degrades the localization accuracy of the system. Feature matching and data association in the non-geometric modules of the system have therefore become a bottleneck that limits both the accuracy of the algorithm and the application of SLAM systems in real life. For visual SLAM in complex environments, this thesis studies the key deep-learning-based technologies of visual feature extraction, feature matching, and visual relocalization, and then constructs a visual SLAM system for complex environments. The main work and innovations of this thesis are as follows:

Firstly, to address the insufficient robustness of traditional visual features in complex environments, this thesis proposes a visual feature detector that combines traditional keypoints with deep learning descriptors. Extracting visual features is the first step of SLAM: the system uses the features to establish matches between images and then solves for the relative poses between those images. In complex scenes, however, traditional visual features are not robust enough to establish reliable matches, and the accuracy of the resulting relative poses cannot meet the needs of precise positioning. The proposed feature detector, based on traditional keypoints and CNN (Convolutional Neural Network) descriptors, establishes more accurate and reliable matches. In the evaluation on the HPatches dataset, the mAP (mean Average Precision) of the proposed method is on average 1.73 percentage points higher than that of SOSNet.

Secondly, to address the over-parameterization of neural network models, this thesis applies NAS (Neural Architecture Search) to the original model to obtain a lighter network, thereby pruning the redundant structure of the network. At present, most neural network models consume too many computational resources at runtime, and most studies consider such models to be over-parameterized, so the network structure can be pruned while the model still performs well. In addition, this thesis uses TensorRT-based network compression and layer fusion, which further reduces network inference time. Tested on the NVIDIA Jetson AGX Xavier platform, the optimized model reduces inference time by 83.7% while maintaining similar accuracy.

Thirdly, because traditional visual SLAM systems cannot obtain accurate localization results in complex scenes, this thesis combines the above results to build a SLAM system based on traditional keypoints and CNN descriptors. To improve the robustness of SLAM in complex scenes, this thesis also optimizes the performance of the feature extraction and back-end optimization stages of the SLAM pipeline, and the system still runs at 27 frames per second even on edge computing devices. Evaluated on three datasets, EuRoC, TUM-VI, and TUM RGB-D, the proposed system reduces the RMSE (Root Mean Square Error) of localization by 35.8% and 24.4% compared with ORB-SLAM2 and VINS-Mono, respectively.
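The localization RMSE figures quoted above are relative reductions against a baseline system. As a minimal sketch of how such a figure can be computed, the following Python snippet takes the root mean square of per-pose Euclidean position errors and then the percentage reduction between two systems. It assumes the estimated and ground-truth trajectories are already time-associated and expressed in the same coordinate frame; the function names `localization_rmse` and `relative_reduction` and the toy positions are hypothetical, chosen for illustration, and are not taken from the thesis or its evaluation.

```python
import math


def localization_rmse(estimated, ground_truth):
    """RMSE of Euclidean position errors between two aligned trajectories.

    Assumption: both inputs are equally long sequences of (x, y, z) positions
    that are already time-associated and in the same coordinate frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and equally long")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, proposed_rmse):
    """Percentage by which proposed_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - proposed_rmse) / baseline_rmse


# Toy, made-up positions in metres; these are not results from the thesis.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
baseline = [(0.0, 0.2, 0.0), (1.0, 0.2, 0.0), (2.0, 0.2, 0.0)]
proposed = [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0), (2.0, 0.1, 0.0)]

baseline_rmse = localization_rmse(baseline, ground_truth)
proposed_rmse = localization_rmse(proposed, ground_truth)
print(f"baseline RMSE: {baseline_rmse:.3f} m")
print(f"proposed RMSE: {proposed_rmse:.3f} m")
print(f"reduction: {relative_reduction(baseline_rmse, proposed_rmse):.1f}%")
```

In practice, trajectory evaluation usually includes an alignment step between the estimate and the ground truth before the error is computed; that step is omitted here to keep the sketch short.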
Keywords/Search Tags:deep learning, convolutional neural network, visual feature descriptor, visual SLAM, visual-inertial odometry