
Research On Visual SLAM Based On Semantic Information In Dynamic Scenes

Posted on: 2024-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Fu
Full Text: PDF
GTID: 2568307118951079
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of deep learning and the improvement of computing power, intelligent robots and unmanned driving technologies with deep interaction capabilities have gradually penetrated many fields such as intelligent transportation, unmanned express delivery, and intelligent services. Visual simultaneous localization and mapping (VSLAM) is a vital technology that allows robots to perceive their surroundings and make decisions. However, most visual SLAM algorithms assume that the scene is static and make little use of the semantic information in the environment; as a result, their accuracy and stability in dynamic environments are poor. To address these challenges, the main research content of this paper is as follows:

(1) Existing semantic segmentation algorithms are not tailored to the segmentation of dynamic objects, resulting in low accuracy, and most VSLAM algorithms also lack a semantic segmentation thread. To solve these problems, a semantic VSLAM algorithm based on an attention mechanism in dynamic scenes, named CBAM-SLAM, is proposed. First, exploiting the ability of the attention mechanism to focus on important features in the input image, a CBAM attention module is added to the FPN layer of the Mask R-CNN network to increase the segmentation accuracy of dynamic targets. Secondly, a VSLAM model based on semantic information is built: the input image is first passed through the improved semantic segmentation network to obtain the mask of the dynamic target, and the mask is then fed into the tracking thread as prior information to eliminate the feature points on the dynamic object. Finally, the VSLAM system uses only the remaining static feature points for subsequent tracking and mapping. Experimental results show that, compared with ORB-SLAM2, CBAM-SLAM reduces the root mean square error by 55.38% and the standard deviation by 58.54% in highly dynamic scenes.

(2) The VSLAM system spends a great deal of time on pixel-level semantic segmentation of dynamic objects, resulting in poor real-time performance, and the threshold used for feature point extraction cannot adapt to the environment, resulting in low robustness. To solve these problems, a lightweight semantic visual SLAM model based on an adaptive threshold and speed optimization is proposed. First, the more lightweight one-stage object detection network YOLOv7-tiny is used to detect dynamic targets, and the resulting detection boxes are passed into the tracking thread to remove the dynamic feature points. Secondly, the visual SLAM system is optimized and accelerated: on the one hand, an adaptive-threshold feature point extraction algorithm improves the stability of the system while improving the efficiency of feature point extraction; on the other hand, a binary bag of words replaces the original bag of words, and the local mapping thread is pruned and optimized without noticeably affecting accuracy, so as to increase the running speed of the system. Experimental results show that, compared with ORB-SLAM2, the absolute trajectory error is reduced by 29.74% and the relative trajectory error by 9.02%. In terms of running speed, the average processing speed reaches 19.8 FPS, which meets the real-time requirements of practical applications.
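The core idea shared by both contributions is to treat the segmentation/detection output as a prior mask and discard feature points that fall on dynamic objects before tracking. The following is a minimal Python sketch of that filtering step, not the thesis implementation (which modifies the C++ tracking thread of ORB-SLAM2); the function name and parameters are illustrative assumptions.

```python
import cv2

def filter_dynamic_keypoints(gray, dynamic_mask, n_features=1000):
    """Illustrative sketch: detect ORB features and drop those on dynamic pixels.

    gray         -- 8-bit grayscale frame
    dynamic_mask -- uint8 array of the same size, nonzero where the
                    segmentation/detection network marked a dynamic object
    """
    orb = cv2.ORB_create(nfeatures=n_features)
    keypoints = orb.detect(gray, None)

    # Keep only keypoints whose pixel location lies outside the dynamic mask.
    h, w = dynamic_mask.shape[:2]
    static_kps = []
    for kp in keypoints:
        x, y = int(round(kp.pt[0])), int(round(kp.pt[1]))
        if 0 <= x < w and 0 <= y < h and dynamic_mask[y, x] == 0:
            static_kps.append(kp)

    # Compute descriptors for the remaining (static) keypoints only.
    static_kps, descriptors = orb.compute(gray, static_kps)
    return static_kps, descriptors
```

In the mask-based variant (CBAM-SLAM) the mask comes from pixel-level segmentation, while in the YOLOv7-tiny variant it can be formed by rasterizing the detection boxes; in both cases only the surviving static feature points are passed on to tracking and mapping.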
Keywords/Search Tags:SLAM, Dynamic Scenes, Attention Mechanism, Target Detection, YOLOv7