3D Model Retrieval Based On Multi-View And Attention

Posted on:2022-06-26

Degree:Master

Type:Thesis

Country:China

Candidate:D D Zhou

Full Text:PDF

GTID:2558307154476154

Subject:Information and Communication Engineering

Abstract/Summary:

With the explosive growth in the number of 3D models, 3D model retrieval has become a hot research direction in information retrieval and computer vision. The rapid development of deep learning has profoundly changed the field. Thanks to mature 2D image processing techniques, multi-view-based 3D model algorithms perform especially well. However, current methods either fail to capture the correlation information between multiple views effectively or lose the saliency information of individual views during multi-view fusion, which weakens the representation ability of the global features of the 3D model.

This thesis proposes a multi-view-based local information fusion algorithm. Inspired by self-attention and spatial attention, a local correlation attention module (LCAM) is proposed, which captures the local saliency information within a single view and reduces the negative impact of repetitive information on multi-view fusion. LCAM is also more interpretable, which enriches the theory of attention mechanisms. The proposed hierarchical network structure gradually integrates the local feature information of multiple views. First, the periodic shuffling (PS) operator integrates the local information of a view into a view super matrix (VSM). Then, based on the hypergraph hypothesis for the 3D model, multiple VSMs are treated as multiple channels in the hypergraph and are integrated into a shape super matrix (SSM) by a further PS operation. As a result, the global features carry richer information and represent the 3D model better.

This thesis also proposes a multi-view fusion algorithm based on multi-level attention. Inspired by non-local attention, a cross-object asymmetric attention (COAA) is proposed, which captures the correlation information between objects and is applied in the cross-view attention module (CVAM) and the cross-layer attention module (CLAM). First, at the horizontal level, the CVAM fuses paired views to capture the correlation information between the paired view features. Because the global feature fusion obtained through local splicing after this initial fusion is considered insufficient, the CLAM further integrates multi-view information in the vertical dimension, using deep semantic features to guide the further fusion of shallow features. This improves the representation ability of the global feature vectors and the accuracy of 3D model retrieval.

To verify the effectiveness of the proposed methods, this thesis reports extensive experiments on ModelNet40/10 and ShapeNetCore55, including comparisons with mainstream methods, network ablation experiments, and visualization experiments in various dimensions. The results indicate that the proposed algorithms are effective, robust, and interpretable.
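
As an illustration of the periodic shuffling (PS) step mentioned in the abstract, the following is a minimal sketch and not the thesis implementation. It assumes that PS follows the usual sub-pixel rearrangement, in which r*r feature maps of size H x W are interleaved into a single (H*r) x (W*r) matrix. The function name `periodic_shuffle`, the toy input maps, and the way local features are assigned to input maps are all hypothetical; the abstract does not specify them.

```python
def periodic_shuffle(channels, r):
    """Interleave r*r feature maps of size H x W into one (H*r) x (W*r) matrix.

    Assumed convention (not confirmed by the abstract):
        out[h*r + i][w*r + j] = channels[i*r + j][h][w]
    """
    if len(channels) != r * r:
        raise ValueError("expected exactly r*r input maps")
    height = len(channels[0])
    width = len(channels[0][0])
    out = [[0] * (width * r) for _ in range(height * r)]
    for k, fmap in enumerate(channels):
        i, j = divmod(k, r)  # row and column offset of map k inside each r x r cell
        for h in range(height):
            for w in range(width):
                out[h * r + i][w * r + j] = fmap[h][w]
    return out


# Four hypothetical 2x2 local feature maps, each filled with its own index.
maps = [[[k, k], [k, k]] for k in range(4)]
super_matrix = periodic_shuffle(maps, r=2)
for row in super_matrix:
    print(row)
# [0, 1, 0, 1]
# [2, 3, 2, 3]
# [0, 1, 0, 1]
# [2, 3, 2, 3]
```

Under this assumption, each r x r cell of the output holds one value from every input map, so local information from all maps sits side by side in a single larger matrix. Applying the same operation again to several such matrices would be one way to picture the VSM-to-SSM step.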

Keywords/Search Tags:

3D model retrieval, Feature fusion, Cross-object asymmetric attention, Local correlation attention, Deep learning

Related items

1	Research On Object Tracking Algorithm Based On Correlation Filtering And Deep Learning
2	Research On Image And Text Retrieval Based On Attention Mechanism
3	Cross-modal Retrieval Based On Transfer Learning And Attention Mechanisms
4	Attention-aware Deep Cross-modal Hashing
5	Research On Deep Learning Object Detection Technology Based On Multi-Scale Feature Fusion
6	Research On Object Detection Method Based On Deep Learning
7	Research On Small Object Detection Algorithm Based On Feature Fusion And Attention Residual Network
8	Research On Cross-modal Hashing Retrieval Based On Deep Feature Learning
9	Study On Attention-aware Prototype Learning Joint Correlation Alignment For Cross-modal Retrieval
10	Attention-based Fusion Triplet Hashing For Cross-modal Retrieval