
Study On Efficient Representation Methods For 3D Point Cloud Semantic Segmentation

Posted on: 2024-02-12    Degree: Doctor    Type: Dissertation
Country: China    Candidate: G Y Zhu    Full Text: PDF
GTID: 1528307118978839    Subject: Computer application technology
Abstract/Summary:
With advances in technology and the growing availability of laser scanners, 3-dimensional (3D) point clouds are used ever more widely in daily life. As a data type that captures the shape of 3D objects with a simple representation, laser point clouds have laid a solid foundation for computer-based understanding and analysis of 3D scenes in many fields, so the study of point-cloud-based 3D semantic segmentation is of great significance to the intelligent development of society. 3D point cloud semantic segmentation is a computer vision task that aims to predict the semantic category of every point in a 3D scene. Unlike regular two-dimensional images, point clouds are stereoscopic data with prominent geometric features, and their non-uniform distribution poses a serious challenge to 3D scene understanding. Current point cloud semantic segmentation methods use deep learning to study the global or local features of point clouds; when multi-scale information is considered, they are limited to multiple scales of either global or local features and ignore the combination of the two. In recent years, some methods have compensated for the missing global or local information by enlarging the point cloud input, but large-scale input incurs high computational cost and raises the hardware requirements for model deployment. In addition, these fully supervised methods rely on massive labeled data, and labeling point clouds with complex geometric structures is time-consuming and labor-intensive. Some methods adopt weakly supervised or self-supervised learning to cope with the difficulty of point cloud labeling, but they are effective only when the labeled and unlabeled data belong to the same closed label space; in practice, semantic categories absent from the training data naturally appear. Although some methods apply few-shot learning to improve the generalization of segmentation models to new semantic categories, they ignore the feature differences between labeled and unlabeled data, i.e., the information gap. In summary, existing point cloud semantic segmentation methods have the following problems: 1) single-scale modeling that ignores the combination of global and local features; 2) large input scale and high computational cost; 3) an information gap between labeled and unlabeled data. This dissertation therefore focuses on these problems and tries to balance the trade-off between model performance and complexity: first, the feature learning capability of the model is improved by designing an efficient feature learning backbone; then, few-shot point cloud semantic segmentation is studied to address the difficulty of labeling large-scale point clouds. The main contributions are as follows.

(1) To solve the single-scale problem of existing point cloud modeling methods, a point cloud representation method based on adaptive fusion of global and local features is proposed, improving semantic segmentation performance. Compared with regular 2D images, point clouds are unordered data with significant geometric features, so their global and local feature spaces carry different information, such as geometric structure. This dissertation proposes a double-branch semantic segmentation architecture: first, a global feature extraction branch and a local feature extraction branch learn the multi-scale global and local features of the point cloud, respectively; then, weights are assigned to the output features of the two branches so that the fusion ratio of global and local features is obtained adaptively during training; finally, an attention mechanism guides the model to learn the important features of the point cloud and reduces the dependence on the number of input points. Experimental results on ModelNet40, ScanObjectNN and S3DIS demonstrate that the proposed method outperforms state-of-the-art point cloud semantic segmentation methods.

(2) To address the high computational cost of existing methods that require large-scale point cloud input, a point cloud representation method based on cyclic self-attention is proposed, which reduces the computational complexity of the model while preserving performance. Traditional deep-learning-based point cloud semantic segmentation methods rely on large-scale point cloud input to obtain enough information for good performance, but such input places demanding requirements on computer hardware and incurs increasingly expensive computation. This dissertation proposes a point cloud semantic segmentation method based on self-attention: as a set operator with permutation invariance, self-attention applies naturally to unordered point cloud data. In addition, a position-encoding module added to the self-attention module encodes the positional relationships between points, so that the inherent geometric properties of the point cloud are learned. Finally, global and local features of the point cloud are adaptively fused in multiple high-dimensional feature spaces to further complement its feature information. Experimental results on the ModelNet40, ScanObjectNN, ScanNet Part and S3DIS datasets demonstrate that the self-attention module can replace traditional convolutional neural network models in point cloud semantic segmentation at a lower computational cost, achieving a better balance between model performance and computational complexity.

(3) To address the information gap between labeled and unlabeled data, a point cloud representation method based on bias rectification is proposed to reduce the feature differences between the two kinds of data. Point clouds are non-uniformly distributed 3D data, and their semantic annotation is time-consuming and labor-intensive. Existing methods apply few-shot learning to reduce the dependence of point cloud semantic segmentation on labeled data while improving generalization to new semantic categories; however, an information gap remains between labeled and unlabeled data, and a small amount of labeled data cannot cover the complete feature information. This dissertation proposes an information bias rectification method for few-shot point cloud semantic segmentation: first, the feature representation of the labeled data is improved by cross-learning between labeled and unlabeled data; second, the information gap is narrowed by capturing the bias between them; finally, the semantic category of unlabeled data is predicted by computing its similarity to the prototype representations generated from the labeled data. Experimental results on two challenging point cloud semantic segmentation datasets demonstrate that the proposed method effectively narrows the information gap between labeled and unlabeled data under few-shot conditions.

(4) Also targeting the information gap between labeled and unlabeled data, a representation method based on co-occurrent feature mining is proposed, which further improves performance by addressing the insufficient use of unlabeled data in the third method. Because of the information gap, forcibly transferring knowledge from labeled to unlabeled data is not valid. This dissertation therefore proposes a few-shot learning method based on co-occurrent feature mining: after refining the feature representation of the labeled data with an attention module, a co-occurrent feature mining module captures the object features that occur in both labeled and unlabeled data, reducing the feature difference between the two types of data. Experimental results on S3DIS and ScanNet demonstrate that the proposed method further narrows the information gap between labeled and unlabeled data under few-shot conditions.
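The adaptive fusion of the two branch outputs described in contribution (1) can be illustrated as a convex combination whose ratio comes from learned logits. This is a minimal sketch under assumptions, not the dissertation's implementation: the function name `adaptive_fuse` and the scalar logits `w_global`/`w_local` are hypothetical, and the real method learns these weights jointly with the network during training.

```python
import math

def adaptive_fuse(global_feat, local_feat, w_global, w_local):
    """Fuse per-point global and local feature vectors.

    A softmax over two learned logits (hypothetical names) keeps the
    fusion ratio on the simplex, so the result is always a convex
    combination of the two branches.
    """
    e_g, e_l = math.exp(w_global), math.exp(w_local)
    a = e_g / (e_g + e_l)          # fraction taken from the global branch
    return [a * g + (1.0 - a) * l  # elementwise convex combination
            for g, l in zip(global_feat, local_feat)]
```

With equal logits the two branches contribute equally; as one logit grows, the fused feature smoothly approaches that branch's output, which is what lets the fusion ratio be tuned by gradient descent rather than set by hand.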
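The claim in contribution (2) that self-attention is a permutation-invariant set operator, made geometry-aware by a position encoding, can be sketched in a few lines. This is an illustrative stand-in, not the dissertation's module: the negative pairwise distance used as a position bias is an assumption (the real position-encoding module is learned), and `self_attention` is a hypothetical name.

```python
import math

def self_attention(points, feats):
    """Scaled dot-product self-attention over a point set.

    `points` are 3D coordinates, `feats` the per-point feature vectors.
    A simple relative-position term (negative Euclidean distance) stands
    in for a learned position encoding: nearer points get larger weights.
    Because every point attends to the whole set, permuting the input
    simply permutes the output (set-operator behavior).
    """
    d = len(feats[0])
    out = []
    for pi, fi in zip(points, feats):
        scores = []
        for pj, fj in zip(points, feats):
            dot = sum(a * b for a, b in zip(fi, fj)) / math.sqrt(d)
            scores.append(dot - math.dist(pi, pj))  # position bias
        m = max(scores)                             # stable softmax
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        out.append([sum(wk * fk[c] for wk, fk in zip(w, feats))
                    for c in range(d)])
    return out
```

Reordering the input points reorders the output rows in exactly the same way, so no canonical point ordering is ever needed, which is why this operator suits unordered point clouds where convolutions need voxelization or neighbor ordering.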
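The final step of contribution (3), predicting the category of unlabeled points from their similarity to prototypes built from the labeled data, follows the standard prototype scheme of few-shot segmentation. The sketch below shows only that step under assumptions: the cross-learning and bias-rectification stages are omitted, and the names `prototypes`/`predict` and the squared-Euclidean similarity are hypothetical choices, not the dissertation's exact formulation.

```python
def prototypes(support_feats, support_labels):
    """Class prototypes: the mean feature of the labeled (support) points
    of each class, returned as a dict {label: mean_vector}."""
    sums, counts = {}, {}
    for f, y in zip(support_feats, support_labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for c, v in enumerate(f):
            acc[c] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

def predict(query_feat, protos):
    """Assign an unlabeled (query) point the class of its nearest
    prototype, using squared Euclidean distance as the (dis)similarity."""
    return min(protos, key=lambda y: sum((a - b) ** 2
                                         for a, b in zip(query_feat, protos[y])))
```

Because each prototype is just a mean of a handful of labeled features, any bias between labeled and unlabeled distributions shifts the decision boundary directly; this is the gap that the cross-learning and bias-rectification stages of the proposed method are designed to close before this prediction step runs.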
Keywords/Search Tags: point cloud semantic segmentation, multi-scale information fusion, few-shot learning, information gap narrowing