With the increasing demands of navigation, virtual reality, and autonomous driving, one of their core problems, simultaneous localization and mapping (SLAM), has received growing attention. The first step in implementing SLAM is to obtain environmental information through sensors, for example using light detection and ranging (LiDAR) to obtain 3D point clouds and color cameras to obtain color images. Each type of data has its own advantages and disadvantages, so how to fuse multi-modal data to achieve higher-precision localization and mapping is a key research direction in the field of SLAM. Recently, some researchers have introduced deep learning methods into SLAM. Compared with traditional SLAM methods, deep-learning-based SLAM methods do not require manually designed feature extraction and are more robust. Therefore, this paper uses deep learning methods to conduct in-depth research on high-precision localization and mapping based on multi-modal data. The main work of this paper is as follows:

(1) Aiming at the problem that current unsupervised SLAM methods use only single-modal data, resulting in low localization accuracy, an unsupervised SLAM method based on 3D point clouds and color images is proposed. The method consists of three parts: unsupervised odometry based on 3D point clouds and color images, loop closure detection based on deep learning, and 3D color map construction. The inputs to the unsupervised odometry are consecutive color images and multi-view depth images generated from the 3D point clouds. The core of the odometry is a recurrent convolutional neural network that predicts the pose. The loss functions comprise 2D and 3D spatial loss terms: the input data are reconstructed using the pose predicted by the network, the 2D spatial loss is the difference between the original and reconstructed color images, and the 3D spatial loss is the difference between the original and reconstructed multi-view depth images. The deep-learning-based loop closure detection uses a pretrained image classification model to find loops in the trajectory, and the loop information is used to optimize the predicted poses. Finally, a 3D color map is constructed by combining the color images and 3D point clouds with the optimized poses.

(2) Aiming at the problem that current supervised SLAM methods use only single-modal data, resulting in maps that carry only a single type of information, a supervised SLAM method based on color point clouds is proposed. The method consists of three components: color point cloud odometry based on dynamic routing, loop closure detection based on the geometric features of the 3D point cloud, and 3D color map construction. The inputs of the color point cloud odometry are two color point clouds, and the odometry consists of three parts: a hierarchical feature extraction network, a pose prediction network, and a pose optimization network. Using dynamic routing instead of max pooling in the hierarchical feature extraction network preserves more feature information. The loop closure detection based on the geometric features of the 3D point cloud first rasterizes the point cloud into a feature matrix and then detects loops by comparing the similarity between feature matrices. The loop closure information is used to optimize the poses predicted by the odometry, and finally a 3D color map is generated.
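As a minimal sketch of the 2D and 3D spatial losses described in (1), the snippet below assumes that the reconstructed color images and multi-view depth images have already been produced by warping the inputs with the pose predicted by the network (the warping step is not shown). The function name, tensor layout, loss weights, and the use of an L1 difference are illustrative assumptions; the abstract does not specify these details.

```python
import torch.nn.functional as F

def spatial_losses(color, color_recon, depth_views, depth_views_recon,
                   w_2d=1.0, w_3d=1.0):
    """Combine the 2D and 3D spatial losses (weights are illustrative)."""
    # 2D spatial loss: difference between original and reconstructed color images
    loss_2d = F.l1_loss(color_recon, color)
    # 3D spatial loss: difference between original and reconstructed
    # multi-view depth images generated from the 3D point clouds
    loss_3d = F.l1_loss(depth_views_recon, depth_views)
    return w_2d * loss_2d + w_3d * loss_3d
```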
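The deep-learning-based loop closure detection in (1) compares frames using a pretrained image classification model. The sketch below assumes a torchvision ResNet-18 backbone, cosine similarity between global features, and a fixed threshold; the abstract does not name the backbone, the similarity measure, or the threshold, so all of these are placeholders.

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Pretrained classifier used only as a feature extractor (backbone choice assumed).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()   # keep the global feature, drop the classifier head
backbone.eval()

@torch.no_grad()
def frame_descriptor(image):
    """image: (3, H, W) tensor, already resized and normalized."""
    return F.normalize(backbone(image.unsqueeze(0)).squeeze(0), dim=0)

def detect_loops(descriptors, sim_thresh=0.9, min_gap=50):
    """Return (i, j) frame pairs whose descriptors are similar enough to be loops.

    `sim_thresh` and `min_gap` (minimum frame separation) are illustrative
    parameters, not values from the thesis.
    """
    loops = []
    for i in range(len(descriptors)):
        for j in range(i + min_gap, len(descriptors)):
            if torch.dot(descriptors[i], descriptors[j]).item() > sim_thresh:
                loops.append((i, j))
    return loops
```

The detected pairs would then be passed to a pose-graph optimization step to refine the odometry's predicted poses, as described in the abstract.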
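For the dynamic-routing aggregation in (2), the following sketch replaces max pooling over the per-point features of a local region with a standard routing-by-agreement loop. The exact routing formulation, iteration count, and feature shapes used in the thesis are not given in the abstract, so these details are assumed.

```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Capsule-style squashing nonlinearity
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)

def dynamic_routing_pool(features, num_iters=3):
    """Aggregate per-point features of one local region with dynamic routing.

    features: (B, N, C) per-point features.
    Returns:  (B, C) aggregated region feature.
    Unlike max pooling, every point contributes according to its learned
    coupling coefficient, so more feature information is preserved.
    """
    B, N, C = features.shape
    logits = features.new_zeros(B, N)            # routing logits b_i
    for _ in range(num_iters):
        coupling = F.softmax(logits, dim=1)      # c_i = softmax(b_i)
        pooled = squash((coupling.unsqueeze(-1) * features).sum(dim=1))   # (B, C)
        logits = logits + (features * pooled.unsqueeze(1)).sum(dim=-1)    # agreement update
    return pooled
```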
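The geometric loop closure detection in (2) rasterizes each 3D point cloud into a feature matrix and compares matrices by similarity. The sketch below assumes a bird's-eye-view grid storing the maximum height per cell and cosine similarity between flattened grids; the grid resolution, range, per-cell feature, and similarity measure are illustrative choices, not values from the thesis.

```python
import numpy as np

def rasterize(points, grid_size=0.5, x_range=(-50.0, 50.0), y_range=(-50.0, 50.0)):
    """Project a point cloud (N, 3) onto a horizontal grid; each cell stores
    the maximum point height as a simple geometric feature (assumed feature)."""
    nx = int((x_range[1] - x_range[0]) / grid_size)
    ny = int((y_range[1] - y_range[0]) / grid_size)
    grid = np.zeros((nx, ny), dtype=np.float32)
    ix = ((points[:, 0] - x_range[0]) / grid_size).astype(int)
    iy = ((points[:, 1] - y_range[0]) / grid_size).astype(int)
    mask = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    np.maximum.at(grid, (ix[mask], iy[mask]), points[mask, 2])
    return grid

def similarity(grid_a, grid_b, eps=1e-8):
    # Cosine similarity between flattened feature matrices; a loop would be
    # declared when this exceeds a threshold (threshold not given in the abstract).
    a, b = grid_a.ravel(), grid_b.ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
```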