Research on Visual-Semantic Cross-Modal Retrieval Based on Hashing Learning

Posted on: 2019-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: X W Li
GTID: 2428330566996029
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Approximate nearest neighbor (ANN) search plays a fundamental role in machine learning, information retrieval, and related applications, and hashing is one of the most representative approaches in ANN research. The goal of hashing is to map data points from the original space into a Hamming space of binary codes. Representing the original data by binary hash codes dramatically reduces storage cost, and building an index over the hash codes allows search in constant or sub-linear time. In many real applications, the data consists of multiple modalities such as image, text, voice, and video. In recent years, with the rapid development of the Internet, multimedia and multi-modal data have grown explosively, and classical information retrieval methods no longer meet users' demands. Cross-modal retrieval, and cross-modal hashing in particular, has therefore attracted much attention. Building on an in-depth study of hashing learning and cross-modal retrieval, this thesis proposes three improved hashing learning algorithms for visual-semantic cross-modal retrieval:

1. A novel cross-modal hashing algorithm, referred to as Similar-Preserving Cross-Modal Hashing (SPCMH), is proposed. SPCMH extends a hashing-based unimodal image retrieval method to the multi-modal setting and casts information retrieval as a classification problem. To improve the robustness and generalization ability of the learned hash codes, independence and balance constraints are also taken into account.

2. A novel kernel-based cross-modal hashing algorithm, termed Anchor Graph Cross-Modal Hashing (AGCMH), is proposed. In general, a kernel method first applies a nonlinear transformation to the original data, which makes it possible to handle data that are not linearly separable in the original space. AGCMH is designed to exploit this advantage. Specifically, a number of anchor points are selected from the training samples by a clustering algorithm, and a nonlinear transformation with respect to the selected anchor points is then computed with the RBF kernel function. With the corresponding hash parameter matrices obtained from the objective function, the image-to-text (Img2Txt) and text-to-image (Txt2Img) retrieval experiments can be carried out.

3. A novel discrete cross-modal hashing algorithm, called Discrete Cross-Modal Hashing based on Matrix Factorization (MFDCMH), is proposed. Hash learning is essentially a discretely constrained problem and is NP-hard. The usual approach relaxes the discrete constraints to derive a suboptimal solution and then obtains the hash parameter matrix by binary quantization; however, the quantization step introduces uncontrollable error. MFDCMH instead uses the discrete cyclic coordinate descent method to learn the binary hash matrix directly.

Extensive experiments are carried out to verify the validity of the proposed algorithms on the single-label datasets Wiki and Pascal VOC 2007 and on the multi-label datasets NUS-WIDE and MIRFLICKR-25K. The experimental results demonstrate the feasibility and effectiveness of the algorithms proposed in this thesis.
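
To make the hashing pipeline described in the abstract concrete, the following is a minimal Python sketch of anchor-based RBF features, sign binarization, and Hamming-distance ranking. It is not the thesis's SPCMH, AGCMH, or MFDCMH algorithm. Several pieces are assumptions made only for illustration: the anchors are drawn at random rather than by clustering, the projection matrix is random rather than learned from an objective function, and all function names, the kernel width `sigma`, and the code length `n_bits` are hypothetical.

```python
import math
import random


def rbf_anchor_features(x, anchors, sigma=1.0):
    """Map a vector to its RBF-kernel similarities with each anchor point."""
    feats = []
    for a in anchors:
        sq_dist = sum((xi - ai) ** 2 for xi, ai in zip(x, a))
        feats.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return feats


def hash_code(feats, projection, mean):
    """Binarize centred features: one bit per projection row (1 if >= 0, else 0)."""
    centred = [f - m for f, m in zip(feats, mean)]
    return tuple(
        1 if sum(w * c for w, c in zip(row, centred)) >= 0 else 0
        for row in projection
    )


def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return sum(x != y for x, y in zip(a, b))


rng = random.Random(0)

# Toy data: 50 samples with 8 features each (synthetic, for illustration only).
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]

# Assumption: random samples stand in for cluster centres used as anchors.
anchors = rng.sample(data, 5)

# Assumption: a random projection stands in for a learned hash parameter matrix.
n_bits = 16
projection = [[rng.gauss(0, 1) for _ in anchors] for _ in range(n_bits)]

feats = [rbf_anchor_features(x, anchors, sigma=2.0) for x in data]
mean = [sum(col) / len(col) for col in zip(*feats)]
codes = [hash_code(f, projection, mean) for f in feats]

# Rank all database items by Hamming distance to the first item's code.
query = codes[0]
ranking = sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))
```

In a cross-modal setting, one would presumably keep a separate projection per modality so that image and text features land in the same Hamming space; the sketch above uses a single modality to stay short.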
Keywords/Search Tags: approximate nearest neighbor, hashing learning, cross-modal retrieval, anchor graph, discrete cyclic coordinate descent