Point cloud semantic segmentation plays an important role in environment perception for autonomous driving. Point clouds are obtained by lidar sensors that scan the environment and measure the travel time of light. Because traditional mechanical rotary-scanning lidar has a fixed scanning angle and high inter-frame redundancy, prism-based non-repetitive scanning lidar has received increasing attention for its unique scanning pattern and low price. Current point cloud semantic segmentation algorithms are designed mainly for repetitive scanning lidar and do not exploit the inter-frame information of non-repetitive scanning lidar. To fully utilize the inter-frame correlation of non-repetitive scanning lidar point clouds and improve 3D environmental perception, this paper studies the non-repetitive scanning lidar point cloud semantic segmentation problem for the first time, proposing a new method based on temporal point cloud fusion and investigating model compression based on knowledge distillation. The main research contents and innovative results of this paper are as follows.

Firstly, to support research on non-repetitive scanning lidar point cloud semantic segmentation algorithms, we built a data acquisition and algorithm validation platform on an unmanned ground vehicle and constructed the first open-source non-repetitive lidar semantic segmentation dataset, which contains 5 scenes with 5107 frames of point clouds and corresponding RGB images; each point cloud frame is annotated with 5 categories.

Secondly, exploiting the high complementarity of inter-frame point clouds that arises because non-repetitive scanning lidar covers different scanning angles at different times, we propose NRSeg, a point cloud semantic segmentation network based on temporal fusion. Built on 3D sparse convolution, the network uses a Point Transformer to fuse temporal point cloud information: a neighborhood search finds the corresponding voxels in the previous frame, a score matrix over these neighbors is learned, and the features of the current-frame voxels are enhanced accordingly. Experimental results on our dataset show that fusing inter-frame information from non-repetitive lidar improves single-frame segmentation accuracy and mitigates the low segmentation accuracy on rare classes.

Thirdly, to address the large number of parameters and high computational complexity of the NRSeg model, we further study pruning the NRSeg network with knowledge distillation and design a distilled network, tiny-NRSeg, for non-repetitive scanning lidar point cloud semantic segmentation. To train the network, in addition to the ground-truth labels, the prediction probabilities of the teacher network and the affinity matrices of intermediate feature layers serve as supervision signals. Pixel-level and pairwise loss functions are designed, with different weights assigned to the different loss terms. Experiments show that the pruned network reduces the model parameters by 74.83% and shortens the inference time by 46.59% while maintaining segmentation accuracy.
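The temporal fusion step can be sketched as follows. This is a minimal NumPy illustration, not the actual NRSeg implementation: the function name `fuse_temporal_features`, the neighborhood size `k`, and the use of dot-product similarity in place of the learned Point Transformer score matrix are all assumptions made for illustration.

```python
import numpy as np

def fuse_temporal_features(cur_xyz, cur_feat, prev_xyz, prev_feat, k=4):
    """For each current-frame voxel, gather its k nearest previous-frame
    voxels, compute attention scores over them, and add the score-weighted
    previous-frame features to the current-frame features."""
    # pairwise squared distances between current and previous voxel centers
    d2 = ((cur_xyz[:, None, :] - prev_xyz[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, :k]              # (N, k) neighbor indices
    neigh = prev_feat[knn]                           # (N, k, C) neighbor features
    # dot-product similarity stands in for the learned score matrix
    scores = (cur_feat[:, None, :] * neigh).sum(-1)  # (N, k)
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    scores /= scores.sum(axis=1, keepdims=True)      # softmax over neighbors
    # enhance current-frame voxel features with fused temporal context
    return cur_feat + (scores[..., None] * neigh).sum(axis=1)
```

In the actual network the similarity and value projections would be learned layers inside a Point Transformer block; the sketch only shows the data flow of the neighborhood search and score-weighted enhancement.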
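The distillation supervision can be sketched in the same spirit. Again a minimal NumPy illustration with hypothetical names and weights (`w_pix`, `w_pair`): the pixel-level term matches student and teacher class probabilities per point, and the pairwise term matches their intermediate-feature affinity matrices, alongside the usual cross-entropy with the true labels.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def pixel_loss(student_logits, teacher_logits):
    """Per-point KL divergence between teacher and student class probabilities."""
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    return (p_t * (np.log(p_t + 1e-9) - np.log(p_s + 1e-9))).sum(-1).mean()

def affinity(feat):
    """Pairwise cosine-similarity (affinity) matrix of a feature map."""
    f = feat / (np.linalg.norm(feat, axis=1, keepdims=True) + 1e-9)
    return f @ f.T

def pairwise_loss(student_feat, teacher_feat):
    """Mean squared difference between student and teacher affinity matrices."""
    return ((affinity(student_feat) - affinity(teacher_feat)) ** 2).mean()

def distill_loss(s_logits, t_logits, s_feat, t_feat, labels,
                 w_pix=1.0, w_pair=0.5):
    """Total loss: cross-entropy on true labels plus weighted distillation terms."""
    ce = -np.log(softmax(s_logits)[np.arange(len(labels)), labels] + 1e-9).mean()
    return ce + w_pix * pixel_loss(s_logits, t_logits) \
              + w_pair * pairwise_loss(s_feat, t_feat)
```

The specific weight values here are placeholders; the abstract only states that different weights are assigned to the different loss terms.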