
Research On Key Technologies Of Semantic SLAM For Complex Dynamic Scenes

Posted on: 2024-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: W H Chen
GTID: 2568306944968229
Subject: Electronic Science and Technology
Abstract/Summary:
As computer hardware performance improves, robotics, autonomous driving, smart homes, space exploration and other fields have gradually become research hotspots. Visual SLAM (Simultaneous Localization and Mapping), a key foundational technology, estimates the camera's own pose by perceiving the surrounding environment while constructing multi-layer maps of that environment. With extensive research on deep learning, deep learning-based models in computer vision have made great breakthroughs compared to traditional algorithms. Introducing deep learning into visual SLAM to form a semantic SLAM system, and thereby improve robustness and accuracy in complex dynamic scenes, is therefore an effective and widely adopted solution.

However, the current fusion of deep learning and traditional visual SLAM is still loosely coupled and does not fully use the prior information available in traditional visual SLAM systems. This leads to the following problems in visual semantic SLAM systems. These systems require multiple single-purpose deep learning models, leading to information redundancy and wasted computational resources. The focus on the universal applicability of algorithms during the design phase often overlooks the a priori knowledge within the SLAM system itself, leading to a lack of customization and suboptimal accuracy and robustness. The use of global semantic features for dynamic map retrieval leads to inefficient retrieval and higher memory consumption. To solve these problems, this thesis completes the following four aspects of work:

(1) To address unstable tracking with feature matching in complex scenes at the front end of the system, the accumulated error caused by optical flow algorithms, and the inability to provide feature points and feature descriptors at the same time, we construct a new feature point tracking model, SPTNet (Super Tracking Network). This thesis verifies and compares SPTNet on indoor and outdoor datasets (KITTI, TUM), and experimental results show that SPTNet improves accuracy by 2%-15%. In addition, ablation results show that tracking accuracy increases by 2%-5% after the small window matching module is introduced into other optical flow algorithms.

(2) To address tracking interruption and the deep iteration required for convergence in feature matching networks based on the Transformer attention mechanism, we propose Polar-Line Search on Transformer (PSOT), an efficient feature matching network for the back end of indirect visual SLAM that uses epipolar constraints to limit Transformer attention. In experiments on public datasets such as TUM and HPatches, the results show that, compared with the classical Transformer matching network, the proposed PSOT model increases the frame rate by 200% with a 50% reduction in parameters, without affecting accuracy or recall. In addition, PSOT was combined with ORB-SLAM2 and compared against ORB-SLAM2 on the EuRoC datasets and in real scenes; the results show that the algorithm improves the positioning accuracy of the system by 20% compared to ORB-SLAM3 and is more robust in low-texture scenes.

(3) To address slow retrieval and high memory consumption when global semantic features are used for map retrieval, this thesis proposes a new spatiotemporal clustering retrieval algorithm. On the KITTI and Tokyo datasets, empirical results show that the proposed retrieval algorithm matches the accuracy of brute-force semantic feature retrieval while using only 10% of the memory required to store the original semantic feature vectors.

(4) To systematically validate the algorithms proposed in this thesis, we designed a complete semantic SLAM system. Experiments on the TUM and EuRoC datasets show that the average positioning accuracy of the proposed semantic SLAM system is improved by 40% compared to ORB-SLAM3. Real-world testing demonstrated the practicality of the system.
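
The idea in (2) of using epipolar constraints to limit Transformer attention can be illustrated with a small NumPy sketch. This is not the thesis's PSOT implementation; it only shows one plausible reading of the idea. It assumes a known fundamental matrix `F` between the two frames, and the function names (`epipolar_attention_mask`, `masked_softmax`) and the pixel threshold `max_dist` are hypothetical choices made for illustration.

```python
import numpy as np

def epipolar_attention_mask(F, pts1, pts2, max_dist=2.0):
    """Return a boolean mask M of shape (N, M).

    M[i, j] is True when pts2[j] lies within max_dist pixels of the
    epipolar line that pts1[i] induces in the second image.
    F is an assumed-known 3x3 fundamental matrix; max_dist is an
    illustrative threshold, not a value from the thesis.
    """
    h1 = np.hstack([pts1, np.ones((pts1.shape[0], 1))])  # (N, 3) homogeneous
    h2 = np.hstack([pts2, np.ones((pts2.shape[0], 1))])  # (M, 3) homogeneous
    lines = h1 @ F.T                                     # row i is F @ h1[i]
    numerator = np.abs(lines @ h2.T)                     # |x2^T F x1|, shape (N, M)
    norm = np.linalg.norm(lines[:, :2], axis=1, keepdims=True)
    norm = np.maximum(norm, 1e-12)                       # guard degenerate lines
    return numerator / norm <= max_dist

def masked_softmax(scores, mask):
    """Row-wise softmax over entries where mask is True; others get weight 0."""
    masked = np.where(mask, scores, -np.inf)
    row_max = np.max(masked, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)  # rows with no candidates
    exp = np.exp(masked - row_max)
    total = exp.sum(axis=1, keepdims=True)
    return np.divide(exp, total, out=np.zeros_like(exp), where=total > 0)

# Toy example: pure horizontal translation, so epipolar lines are image rows.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [30.0, 40.0]])
pts2 = np.array([[50.0, 20.0], [5.0, 41.0], [0.0, 100.0]])

mask = epipolar_attention_mask(F, pts1, pts2, max_dist=2.0)
weights = masked_softmax(np.zeros((2, 3)), mask)
```

In this sketch each query point attends only to candidates near its epipolar line, which shrinks the search space from the whole image to a narrow band. How PSOT actually builds and applies the constraint is not described in the abstract.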
Keywords/Search Tags: visual SLAM, semantic features, feature matching, optical flow prediction, map retrieval