With the arrival of artificial intelligence, computer vision has developed significantly, moving from traditional machine-learning methods to deep-learning methods. As an important research direction in computer vision, point cloud learning plays a major role in many areas of 3D computer vision, such as autonomous driving, robot navigation and 3D scene reconstruction. Point cloud learning mainly covers 3D point cloud classification, 3D point cloud object detection and tracking, and 3D point cloud semantic segmentation. Traditional Simultaneous Localization and Mapping (SLAM) systems use features matched between 2D images for localization, navigation and mapping. As acquisition equipment has improved, SLAM systems based on point cloud learning, such as lidar-based SLAM, have developed rapidly. However, there is still much room for development in SLAM systems that combine 2D and 3D information. Therefore, this paper studies semantic segmentation of 3D point clouds in order to obtain information at the point cloud level, which can be used to optimize traditional SLAM systems in the future. The main work of this paper is as follows.

First, this paper improves the baseline starting from the encoding of feature location information. It studies deep-learning-based semantic segmentation of 3D point clouds and takes RandLA-Net as the baseline. The original RandLA-Net directly combines the relative position, scale and spatial coordinates of adjacent points and then extracts features through a multi-layer perceptron, which can be regarded as a form of hard encoding. Furthermore, it uses the k-nearest neighbor algorithm to find the k nearest neighbors of each center point, and the search result is strongly affected by the density of the point cloud. Therefore, this paper uses the PointSIFT ordered operator to encode the point cloud features softly, eliminating the influence of uneven sampling caused by uneven point cloud density. In addition, the ordered operator convolves the point cloud data along three directions and stacks multiple modules, so that it not only encodes the direction and position information of the point cloud but is also scale-aware, which provides a good feature foundation for the subsequent enhancement of point cloud features.

Second, this paper uses a simplified attention mechanism to enhance the extracted point cloud features. The original RandLA-Net uses attentive pooling to enhance the features extracted by local spatial information encoding. However, attentive pooling learns an attention score for each feature, and its calculation process is relatively complex. Therefore, this paper uses a modified attention mechanism, the Attentive SE Block, to learn an attention score for each feature. In addition, to find the optimal parameter setting, this paper conducts extensive ablation experiments on the dimension reduction rate of the Attentive SE Block and on its embedding position in the network.

Finally, the segmentation results are post-processed to optimize the final output. Most point cloud semantic segmentation algorithms directly output the classification label of each point through multi-layer perceptrons without additional optimization steps. Therefore, this paper uses an attention-based score refinement module to correct misclassified points according to the feature information of each point's adjacent points. In the end, the overall accuracy (OA) reaches 87.9% on the Stanford 3D semantic parsing dataset (S3DIS), which is 0.3% higher than the baseline.
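To make the "hard encoding" attributed to the original RandLA-Net more concrete, the following is a minimal sketch rather than the implementation used in this paper. It assumes that "scale" refers to the Euclidean distance between the center point and a neighbor, and that the encoded vector concatenates the center coordinates, the neighbor coordinates, their offset and that distance. The function names `k_nearest` and `hard_encode` are hypothetical, the neighbor search is brute force, and the center point is excluded from its own neighbor set as a simplification. The multi-layer perceptron that would follow is omitted.

```python
import math


def k_nearest(points, center_idx, k):
    """Return indices of the k points closest to points[center_idx] (brute force)."""
    center = points[center_idx]
    # Simplification (assumption): the center point is not its own neighbor.
    others = [i for i in range(len(points)) if i != center_idx]
    others.sort(key=lambda i: math.dist(center, points[i]))
    return others[:k]


def hard_encode(points, center_idx, k):
    """Concatenate center xyz, neighbor xyz, relative offset and distance.

    Produces one 10-value row per neighbor; a learned MLP would normally
    be applied to these rows afterwards (not shown here).
    """
    center = points[center_idx]
    encoded = []
    for j in k_nearest(points, center_idx, k):
        neighbor = points[j]
        offset = [c - n for c, n in zip(center, neighbor)]
        distance = math.sqrt(sum(o * o for o in offset))
        encoded.append(list(center) + list(neighbor) + offset + [distance])
    return encoded


# Toy cloud: three nearby points and one far-away point in a sparse region.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
features = hard_encode(cloud, center_idx=0, k=2)
```

With `k=3` on the same toy cloud, the distant point at `(5.0, 5.0, 5.0)` would also be selected as a neighbor. This illustrates why a fixed-k search is sensitive to point density, which is the issue the ordered operator is meant to address.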