Research On Key Technologies Of Sparsification For 3D Data Target Recognition

Posted on:2024-07-29

Degree:Master

Type:Thesis

Country:China

Candidate:H Y Ling


GTID:2568306944470744

Subject:Communication engineering

Abstract/Summary:

With the rapid development of deep learning, 3D data is widely used in autonomous driving and robotics, and target recognition based on 3D data has become a focus of current research. Target recognition methods for 3D data fall mainly into point cloud-based, voxel-based, and multi-view-based methods. Among them, multi-view-based methods offer easy data acquisition and high recognition accuracy, and have therefore received considerable attention. Although multi-view 3D target recognition has achieved good results, existing methods still have shortcomings, and there is room for improvement in both recognition accuracy and model efficiency. To address these problems, this thesis studies the key technologies of multi-view image recognition, graph convolution, and graph pooling. It first proposes a multi-view image recognition algorithm based on view association. Because existing multi-view recognition methods do not fully utilize view feature information, it then proposes a graph convolution-based optimization algorithm for multi-view image recognition. On this basis, because existing multi-view graph pooling methods do not fully utilize graph structure information, it proposes a graph pooling method for the multi-view setting.

1. To address the difficulty of effectively exploiting the latent correlation information between multiple views, this thesis proposes a view association-based multi-view image recognition network model (VRANet). The feature extraction network is optimized; a view association search module finds the association features of each view's local features; a view association coding module encodes the relative viewpoint positions between views into the association features; and all association features are then aggregated using an attention mechanism. The proposed model is tested on the ModelNet40 dataset with 20 input views, and the experimental results show that the classification instance accuracy of VRANet is 1.2% higher than that of the MVCNN model.

2. To address the limited applicability of graph convolution methods in multi-view scenarios, this thesis proposes a graph convolution-based multi-view image recognition optimization network model (MVGCN). A multi-view graph structure is first constructed, the relationships between nodes in the graph are defined by an adjacency matrix initialization method, and a graph convolution module is used for view feature extraction, so that graph convolution makes efficient use of the view features. The proposed network model is tested and validated on the ModelNet40 dataset, and the experimental results show that MVGCN has 59.3% fewer parameters than the original method while maintaining a classification instance accuracy of 97.6%.

3. To address the low efficiency of graph pooling methods in multi-view scenarios, this thesis proposes a graph pooling-based multi-view image recognition network model (View-GFN). An assignment matrix is used to obtain the view features and adjacency matrix of the pooled graph, which realizes multi-view graph pooling. The assignment matrix is generated by graph convolution, so that it captures rich graph structure information. A fused graph convolution module then performs view feature extraction and assignment matrix generation simultaneously, which improves the operational efficiency of the model. The proposed model is compared with View-GCN on the ModelNet40 dataset, and the experimental results show that the classification instance accuracy of View-GFN reaches 97.8%, which is 0.2% higher than that of View-GCN, while the number of model parameters is 49.9% lower than that of View-GCN.

On this basis, to address the large amount of redundant information among views, this thesis proposes a multi-view sparsification method. The above model is used to obtain the target recognition accuracy for different numbers of views, and the recognition accuracy and data usage efficiency are analyzed to give multi-view sparsification results. The proposed method is evaluated on public datasets such as ModelNet10, and the optimal multi-view sparsification results are obtained for a specific dataset.
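
The graph pooling step described for View-GFN can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the common assignment-matrix pooling form (pooled features S^T Z, pooled adjacency S^T A S) and reads "fused" as a single graph convolution whose output columns are split into view features and assignment scores. All names here (fused_graph_pool, feat_dim, the ring-shaped view graph, the random weights) are hypothetical and are not taken from the thesis.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]


def row_softmax(m):
    """Softmax over each row, so every view's cluster weights sum to 1."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def normalize_adjacency(adj):
    """Row-normalize A + I so each view averages over itself and its neighbours."""
    n = len(adj)
    with_self = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in with_self]


def fused_graph_pool(adj, feats, weight, feat_dim):
    """Assumed fused step: one graph convolution yields features and assignments.

    The first feat_dim output columns become view features Z (after ReLU);
    the remaining columns become the assignment matrix S (after softmax).
    """
    fused = matmul(matmul(normalize_adjacency(adj), feats), weight)
    embedded = [[max(0.0, v) for v in row[:feat_dim]] for row in fused]
    assign = row_softmax([row[feat_dim:] for row in fused])
    s_t = transpose(assign)
    pooled_feats = matmul(s_t, embedded)          # S^T Z: one row per pooled node
    pooled_adj = matmul(matmul(s_t, adj), assign)  # S^T A S: pooled connectivity
    return pooled_feats, pooled_adj, assign


# Toy input (illustrative only): 6 views on a ring, pooled down to 2 nodes.
random.seed(0)
n_views, in_dim, out_dim, n_clusters = 6, 4, 3, 2
adj = [[1.0 if abs(i - j) in (1, n_views - 1) else 0.0 for j in range(n_views)]
       for i in range(n_views)]
feats = [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(n_views)]
weight = [[random.uniform(-1, 1) for _ in range(out_dim + n_clusters)]
          for _ in range(in_dim)]

pooled_feats, pooled_adj, assign = fused_graph_pool(adj, feats, weight, out_dim)
```

Because each row of the assignment matrix sums to 1, the total edge weight of the pooled adjacency equals that of the original graph, which is a convenient sanity check. Sharing one convolution between the two outputs is one plausible way to obtain the parameter savings the abstract reports, but the thesis's actual module may differ.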

Keywords/Search Tags:

target recognition, multi-view images, deep learning, graph convolution, graph pooling

Related items

1	Activity Recognition Based On Ubiquitous Computing
2	Research On Multi-View Deep Graph Clustering
3	Study Of The Multi-view-based Graph Representation Learning For Recommendation
4	Research And Implementation On Anchor Graph Based Multi-View Learning Algorithm
5	Gait Recognition Research Based On Multi-Stream Feature Fusion Of Skeleton And Graph Convolution
6	Research On Pooling Algorithm Based On Graph Neural Network
7	Research On Point Cloud Classification Based On Graph Convolution
8	Research And Application Of Face Attribute Recognition Based On Deep Learning
9	Multi-label Image Recognition Based On Deep Graph Convolution
10	Multi-view Feature Learning Based On Skeleton And Image Data And Its Application In Behavior Recognition