| Deep learning has been an active field since 2006, and it has been more than a decade since LeCun successfully used a Convolutional Neural Network (CNN) to recognize handwritten digits, after which AlexNet succeeded at image classification. The tasks addressed have also grown from simple image classification and time series analysis to complex ones such as spatial segmentation and generative adversarial networks. Like traditional machine learning, deep learning is divided into supervised and unsupervised learning. Supervised learning remains the more important approach thanks to the development of big data: for most problems, a large amount of labeled data is available. As classification and detection techniques for two-dimensional images mature, many researchers have turned their attention to three-dimensional objects. Classification and semantic segmentation of three-dimensional data have always been challenging. The classification task is to predict the category of an object given its three-dimensional representation (RGB-D image, point cloud, voxels, or mesh). Semantic segmentation is more difficult: given a specific scene (usually indoor), it assigns a class to every element of that scene (each pixel, or each point of the point cloud). At present, research on classification and segmentation of 3D data is the cornerstone of this sub-field and lays the foundation for the next step, instance segmentation. Although the field is becoming more popular, many problems remain unresolved. The current mainstream datasets (S3DIS, SUN RGB-D, ScanNet) have begun to take shape, but they are still small in scale compared with the datasets used for two-dimensional image recognition (ImageNet, MS COCO). Moreover, directly transferring mainstream 2D solutions to 3D still causes many problems, such as slow computation and high storage cost. In this paper, we propose a
framework called PointSIFT, a point-to-point data extraction method that can be inserted into other network frameworks. It is inspired by the highly successful SIFT operator. We aim to improve the performance of existing 3D point cloud networks through this intermediate processing framework, which is simple and effective. Finally, we report the results of this framework on mainstream datasets, verify its advantages by comparing it with other methods, and demonstrate its flexibility. |