
Single Image Based 3D Shape Retrieval

Posted on: 2021-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: Q F Zou
Full Text: PDF
GTID: 2428330605980082
Subject: Computational Mathematics
Abstract/Summary:
Given a query in one data modality, cross-modal retrieval finds similar samples in another data domain. For example, users often enter keywords to search for related web pages, images, or videos via search engines such as Google. We focus on image-based 3D shape retrieval, i.e., retrieving 3D models whose shape is similar to that of the object contained in a given single query image. Existing related work usually focuses on retrieving objects that belong to the same semantic category as the image, regardless of whether the specific shape is similar; in contrast, we attend to the similarity of the specific shape during retrieval. We propose a novel and effective joint embedding method based on a hybrid 3D representation to retrieve 3D objects that are similar in shape to the object contained in the query image.

Our method consists of two stages. In the first stage, we pre-train a 3D feature space through a joint embedding built on two hybrid representations: octrees and multi-view images. This pre-trained space depends only on the geometry of the 3D models and is unaffected by interference factors such as background, color, and texture in real images; it shows clear advantages when training data of real image-3D model pairs is insufficient. In the second stage, in order to bridge the semantic gap between real images and 3D models, we introduce a 3D feature transform layer and an image encoder that map both shape codes and real images into a common space via a second joint embedding. At test time, each 3D model is represented as an octree and multi-view images, which are fed through the first-stage encoder and the second-stage transform layer to obtain 3D features; query image features are obtained through the second-stage image encoder. A KNN search then retrieves the 3D models most similar to the image query in our joint embedding feature space.

Our pre-training stage benefits from the hybrid representation of 3D models and builds a more discriminative 3D shape space than either representation alone, which further benefits the joint embedding in the second stage. We conducted retrieval experiments on several typical datasets: ObjectNet3D, Pascal3D+, and a manually calibrated chair evaluation set. The results show that our method outperforms the other state-of-the-art methods. In addition, owing to its lower training-time cost, our method is easier to apply to large datasets.
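The two-stage pipeline above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the thesis implementation: the learned networks (octree-based encoder, multi-view encoder, 3D feature transform layer, and image encoder) are replaced by fixed random linear maps, and all feature dimensions are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
D_OCT, D_VIEW, D_IMG, D_EMB = 64, 64, 128, 32  # assumed dimensions

# Stage 1 (stand-in): projections that embed octree and multi-view features
# of a 3D model into a shared shape space. In the thesis these are learned
# networks trained by a first joint embedding; here they are random matrices.
W_oct = rng.standard_normal((D_OCT, D_EMB))
W_view = rng.standard_normal((D_VIEW, D_EMB))

def shape_code(oct_feat, view_feat):
    """Joint embedding of the hybrid representation (octree + multi-view)."""
    z = oct_feat @ W_oct + view_feat @ W_view
    return z / np.linalg.norm(z)  # unit-normalize for cosine similarity

# Stage 2 (stand-in): a transform layer maps shape codes, and an image
# encoder maps real-image features, into a common retrieval space.
W_transform = rng.standard_normal((D_EMB, D_EMB))
W_img = rng.standard_normal((D_IMG, D_EMB))

def embed_shape(oct_feat, view_feat):
    z = shape_code(oct_feat, view_feat) @ W_transform
    return z / np.linalg.norm(z)

def embed_image(img_feat):
    z = img_feat @ W_img
    return z / np.linalg.norm(z)

def retrieve(query_img_feat, gallery, k=3):
    """KNN retrieval: rank gallery shapes by cosine similarity to the query."""
    q = embed_image(query_img_feat)
    sims = gallery @ q  # dot product of unit vectors = cosine similarity
    return np.argsort(-sims)[:k]

# Usage: embed 10 random "models" and retrieve the top-3 for a random query.
gallery = np.stack([embed_shape(rng.standard_normal(D_OCT),
                                rng.standard_normal(D_VIEW))
                    for _ in range(10)])
top3 = retrieve(rng.standard_normal(D_IMG), gallery, k=3)
print(top3.shape)  # (3,)
```

The key design point this sketch mirrors is that the gallery embeddings are computed offline from geometry alone (octree plus multi-view), so at query time only the lightweight image encoder and a nearest-neighbor search are needed.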
Keywords/Search Tags:Image-based 3D shape retrieval, joint embedding, 3D shape feature representation, cross-modal deep learning