
Point Cloud Sequence Classification Based On Transformer And PointLSTM

Posted on: 2024-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Wei
GTID: 2568307151953629
Subject: Computer technology
Abstract/Summary:
Vision-based pose sequence classification is an active topic in computer vision research and has important application prospects in human-computer interaction. Previous research has mainly focused on classifying images, videos, and skeleton sequences. Compared with skeletons and video images, point cloud sequences offer more flexibility for pose classification in environments with poor visibility, and they capture more precise geometric and dynamic structure than traditional videos and skeletons. However, few pose classification methods exist for point cloud sequences. This thesis therefore proposes two point cloud sequence classification networks. The specific research content is as follows:

(1) A Transformer-based deep neural network is used to classify 3D point cloud sequences. First, one layer of the PointNet++ network extracts intra-frame features from the point cloud sequence, and two layers of an improved PointNet++ network extract inter-frame features. Then, a Transformer network applies self-attention to these local features to capture the motion information of the entire sequence. Finally, a cross-entropy loss function is used to reduce the classification error. Because self-attention weighting is performed over the entire sequence, this method effectively handles the problem of points flowing in and out between frames: points that flow out between frames receive low attention weights and are effectively ignored.

(2) A dual-stream neural network based on PointLSTM is used to classify 3D point cloud sequences. In one stream, a dynamic graph edge convolution network (a point cloud network) extracts global position features from the original point cloud. In the other stream, fine-grained pose features are extracted by the PointLSTM network. These multi-scale features are then concatenated through embedding layers and modeled as a time series with an LSTM. Finally, a cross-entropy loss function is used to reduce the classification error, and the final classification result is obtained. By using a dual-stream network, this method effectively extracts multi-scale features and addresses the insufficient multi-scale feature extraction of current single-stream models.

(3) Comparative and ablation experiments were conducted on the SHREC'17 and MSRAction datasets using the two point cloud sequence classification networks proposed in this thesis. Extensive experimental results show that the proposed framework achieves good results on both real-world datasets. Compared with the latest point cloud sequence method, accuracy on the MSRAction dataset is improved by 0.3% and 0.8%.
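The self-attention weighting described in (1) can be illustrated with a minimal sketch. This is not the thesis's network: it assumes single-head scaled dot-product attention with identity projections in place of learned query, key, and value matrices, it omits the PointNet++ feature extraction entirely, and the token values and helper names (`softmax`, `self_attention`, `cross_entropy`) are hypothetical. It only shows how a token that matches the rest of the sequence poorly ends up with a low attention weight, and how a cross-entropy loss is computed from class scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention (identity projections).

    features: list of N token vectors, each a list of d floats. A token stands
    in for one local feature taken from some frame of the sequence (assumption).
    Returns (outputs, weights); weights[i][j] is how much token i attends to j.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in features:
        # Similarity of this token to every token in the whole sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of all token vectors.
        outputs.append(
            [sum(w[j] * features[j][c] for j in range(len(features)))
             for c in range(d)]
        )
    return outputs, weights


def cross_entropy(logits, label):
    """Cross-entropy loss for one sample given raw class scores."""
    return -math.log(softmax(logits)[label])


# Hypothetical toy tokens: three mutually similar features and one stray feature.
tokens = [
    [2.0, 0.0],
    [1.8, 0.2],
    [2.1, -0.1],
    [0.0, 0.1],  # stray token that matches the others poorly
]
outputs, weights = self_attention(tokens)
```

In this toy input, the first token assigns its smallest attention weight to the stray token, so the stray token contributes least to that token's output. In a trained model the projections are learned, so which points are down-weighted depends on training rather than on raw dot products.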
Keywords/Search Tags: Point cloud sequence, Action classification, Deep learning, Transformer, Long short-term memory network