The Research Of 3D Model Retrieval Algorithm Based On Deep Metric Learning

Posted on: 2022-08-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Shao
GTID: 2518306323960649
Subject: Software engineering
Abstract/Summary:
With the rapid development of computer technology, three-dimensional (3D) vision appears more and more in people's lives, and research on 3D vision accounts for a growing share of work in computer vision. In both practical applications and academic research, quickly and accurately retrieving the correct model from a large collection of 3D models is a key problem. 3D model retrieval has therefore received increasing attention and has become an important research direction. A 3D model can be represented in several ways, including multi-view images, point clouds, and voxels; the work in this thesis adopts the multi-view representation. With the rise of deep learning, many multi-view 3D model retrieval algorithms based on deep learning have appeared. However, there is still room for improvement in mining the relationships between multiple views, and the features used for retrieval are often insufficiently discriminative. This thesis therefore focuses on how to capture long-range dependencies between views and how to use deep metric learning to make features more discriminative. The main work consists of the following two parts:

(1) A multi-view 3D model retrieval algorithm based on the visual transformer (MVTN). Using patch features instead of view features allows the relationships between views to be mined at a finer granularity. The algorithm proposes patch convolution, which constructs a neighbor graph from patch features and defines the edge features of the graph from patch features and patch coordinates. Fusing the edge features yields new patch features that contain information about neighboring patches, thereby capturing long-range dependencies between views in a fine-grained way. At the same time, the relationships between patch features within each view should not be ignored. A patch-level transformer assigns weights to the patches in a single view and fuses the patch features into a view feature. A view-level transformer mines the relationships between view features and fuses them. During training, deep metric loss functions make the extracted features more discriminative. Experiments on related datasets show that the algorithm achieves strong performance.

(2) A Patch Convolutional Neural Network for view-based 3D model retrieval (PCNN). This algorithm improves the patch convolution operation of MVTN. In the improved patch convolution, the manually defined patch coordinates used in MVTN are replaced by a positional encoding learned by the network itself, which better captures the location of each view patch and the positional associations between patches. Adding the positional encoding yields new patch features that contain both semantic information and location information. As a result, the neighbor graph is constructed using not only semantic similarity but also the similarity of positional encodings, which helps to explore the connections between patches. An adaptive view weight layer assigns different weights to the views so that the discriminative information in each view is fully used. To supervise training with both model features and view features, the discriminative loss function must correctly classify not only the model features but also the view features, ensuring that the correct model features are learned. Experiments on public datasets show that the algorithm achieves state-of-the-art performance on the 3D model retrieval task.
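
The patch convolution described in part (1) can be illustrated with a minimal sketch. The abstract does not specify the distance metric, the exact edge function, or the fusion operator, so everything concrete here is an assumption: neighbors are chosen by Euclidean distance in feature space, an edge feature is the concatenation of the center patch feature, the neighbor-minus-center feature difference, and the coordinate difference, and edges are fused by an element-wise maximum. The function names `knn_graph` and `patch_conv` and the `(view, row, col)` coordinate layout are hypothetical, and the learned layers a real network would apply to the edge features are omitted.

```python
import math


def knn_graph(features, k):
    """Return, for each patch, the indices of its k nearest patches.

    Assumption: nearness is Euclidean distance between patch features.
    """
    neighbors = []
    for i, fi in enumerate(features):
        dists = [(math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def patch_conv(features, coords, k):
    """Build edge features on the neighbor graph and fuse them per patch.

    Assumed edge feature: [center feature, neighbor - center feature,
    neighbor - center coordinate]. Assumed fusion: element-wise max
    over all edges of a patch.
    """
    graph = knn_graph(features, k)
    fused = []
    for i, nbrs in enumerate(graph):
        edges = []
        for j in nbrs:
            feat_diff = [b - a for a, b in zip(features[i], features[j])]
            coord_diff = [b - a for a, b in zip(coords[i], coords[j])]
            edges.append(list(features[i]) + feat_diff + coord_diff)
        # Element-wise max across the k edge features of patch i.
        fused.append([max(column) for column in zip(*edges)])
    return fused


if __name__ == "__main__":
    # Four toy patches with 2-D features and hypothetical (view, row, col) coordinates.
    features = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
    coords = [(0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 1, 1)]
    for row in patch_conv(features, coords, k=2):
        print(row)
```

Because neighbors are selected in feature space and not by view membership, a patch can be linked to patches from other views, which is one way to read the "long-range dependencies between views" mentioned above. Under the same assumptions, the PCNN variant in part (2) would add a learned positional encoding to each patch feature before the graph is built, in place of the explicit coordinate difference.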
Keywords/Search Tags: 3D Model Retrieval, Multi-view, Deep Learning, Patch Convolution