| Semantic segmentation is an important research direction in computer vision. It classifies a scene at the pixel level and assigns a semantic label to each category. Applying semantic segmentation models to visual SLAM systems enables real-time detection and rejection of dynamic objects in the environment and improves the robustness of robot navigation systems. However, the limited memory and processor performance of mobile robots make it difficult to deploy current high-accuracy semantic segmentation models on them. Therefore, this paper designs a lightweight semantic segmentation model with the aim of improving the model’s overall performance, and applies it to visual SLAM, the core algorithm of mobile robot navigation systems. The main research results of this paper are as follows. (1) To address the limited accuracy of current mainstream semantic segmentation models, this paper improves on the LiteSeg semantic segmentation model and proposes the GL-MSE semantic segmentation model. First, to improve the segmentation accuracy of the network, a feature pyramid module with a Res2 Encoder structure is proposed, which extracts feature vectors at different resolutions and fuses them. Second, a Transformer encoder module is used, which combines the accuracy of global Transformer networks with the scalability of convolutional structures by means of the covariance matrix of keys and queries across channels. Experiments show that the GL-MSE model achieves high segmentation accuracy. (2) To address the difficulty of deploying high-precision semantic segmentation models on robots in real time, this paper makes the GL-MSE model lightweight through the design of its convolutional modules and effective feature transmission. After analyzing and comparing different lightweight convolutional modules in terms of computational complexity and number of model parameters, the MobileOne module is selected as the 
feature extraction module in the backbone network of the semantic segmentation model. Meanwhile, to preserve the feature extraction ability of the model, the CA attention mechanism is introduced to obtain more effective feature information. Experiments show that the improved model meets the requirements of real-time processing while maintaining accuracy. (3) A visual SLAM system based on ORB-SLAM3 is proposed. Its tracking thread is improved by integrating the lightweight GL-MSE semantic segmentation model proposed in this paper to reject dynamic feature points, and map building and simulation experiments are carried out. Comparison of the trajectory plots, together with quantitative analysis of the absolute trajectory error and relative pose error against the ORB-SLAM3 and RDS-SLAM models, shows that the proposed algorithm clearly improves accuracy, global trajectory consistency, and visual odometry drift. In summary, the lightweight GL-MSE semantic segmentation model proposed in this paper improves segmentation accuracy through the feature pyramid and Transformer encoder, and maintains real-time performance and feature extraction capability through lightweight design and the attention mechanism. The model can also be applied to visual SLAM systems to detect and reject dynamic feature points in the environment in real time, improving the robustness of the system in dynamic environments. |
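
To illustrate the dynamic feature point rejection described in (3), here is a minimal sketch. It assumes the segmentation model outputs a per-pixel class-id mask and that the keypoints are given as (x, y) pixel coordinates. It is not the implementation used in this paper: the function name `reject_dynamic_keypoints`, the choice of dynamic class ids, and the nearest-pixel lookup are all hypothetical choices made for illustration.

```python
import numpy as np


def reject_dynamic_keypoints(keypoints, label_mask, dynamic_ids):
    """Return the keypoints that do not lie on pixels labelled as dynamic.

    keypoints   : (N, 2) array-like of (x, y) pixel coordinates
    label_mask  : (H, W) integer array of per-pixel class ids
    dynamic_ids : iterable of class ids treated as dynamic (an assumption)
    """
    keypoints = np.asarray(keypoints, dtype=float).reshape(-1, 2)
    height, width = label_mask.shape
    # Round each keypoint to its nearest pixel and clamp to the image bounds.
    cols = np.clip(np.rint(keypoints[:, 0]).astype(int), 0, width - 1)
    rows = np.clip(np.rint(keypoints[:, 1]).astype(int), 0, height - 1)
    # A keypoint is dynamic if the class id under it is in dynamic_ids.
    is_dynamic = np.isin(label_mask[rows, cols], list(dynamic_ids))
    return keypoints[~is_dynamic]


if __name__ == "__main__":
    # Toy 4x4 mask: class id 15 (a hypothetical "person" id) covers the centre.
    mask = np.zeros((4, 4), dtype=int)
    mask[1:3, 1:3] = 15
    points = [[0, 0], [1, 1], [2, 2], [3, 3]]
    print(reject_dynamic_keypoints(points, mask, dynamic_ids={15}))
```

In a tracking thread, a filter of this kind would run after feature extraction and before pose estimation, so that only keypoints on static regions contribute to the camera pose.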