
Research On Deep Hashing For Large-Scale Image Retrieval

Posted on: 2021-03-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Chen
GTID: 1368330605481202
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Internet and the growing popularity of camera-equipped smart devices, people can take images and upload them to the Internet anytime and anywhere, and these images spread widely and rapidly through social networks. As a result, image collections have been growing explosively, and it has become extremely difficult to find the images relevant to a user's request among such an overwhelming number of images. Approximate Nearest Neighbor (ANN) search is an effective solution to this large-scale image retrieval problem. Hashing methods have been widely applied to ANN search in large-scale image retrieval because of their fast search speed and low storage cost. Owing to the powerful feature learning capability of deep learning, deep hashing, which learns image features and hash codes jointly, has become an emerging research stream and has received extensive attention. This thesis considers three typical large-scale image retrieval scenarios, i.e., simple image retrieval, multi-label image retrieval, and image-text cross-modal retrieval, and studies the deep hashing problem in each of them. The major contributions are as follows:

(1) For hash learning in simple image retrieval, a deep hashing method named Deep Supervised Hashing with Nonlinear Projections (DSHNP) is proposed. Existing deep hashing methods widely adopt linear perceptron functions as the hashing projection functions, which limits the learning capacity of the projections and leads to suboptimal performance of the hash model. DSHNP instead adopts soft decision trees as nonlinear projection functions. Soft decision trees inherit the nonlinear mapping capacity of decision trees while producing differentiable outputs, so they can be combined with deep learning architectures that are trained by backpropagation. In addition, to handle redundancy in the hash codes, two novel regularizers, a parallel regularizer and an orthogonal regularizer, are imposed on the parameter matrices of the leaf nodes in the soft decision trees. It can be proved theoretically that these two regularizers make the projection directions orthogonal and thereby reduce redundancy in the hash codes. Extensive evaluations on two benchmark image datasets show that the proposed DSHNP outperforms several state-of-the-art hashing methods.

(2) For hash learning in multi-label image retrieval, a deep hashing framework called Deep Multiple-Instance Ranking based Hashing (DMIRH) is presented. Most existing hashing methods ignore two characteristics of multi-label image retrieval, i.e., that the input is a multi-label image and that the output is a ranking list, which leads to suboptimal representations of multi-label images and degrades the performance of the hash codes. This problem can be formalized as multiple-instance ranking learning, and DMIRH is proposed to address it. In DMIRH, each image is represented as a bag of object proposals, i.e., as a bag of instances. A category-aware bag feature construction module is designed to learn the bag feature vector: it assigns the learned instance features to categories by selecting category-representative instances, and it aggregates the selected instance features into a bag feature vector using a radial basis function. To approximate the inner product distance between the aggregated bag feature vectors, the product quantization algorithm is extended to approximate the inner product distance instead of the Euclidean distance. It can be proved theoretically that the quantization loss effectively controls the hash quality. Experimental results on public benchmarks show the superiority of DMIRH over several state-of-the-art hashing methods.

(3) For hash learning in cross-modal retrieval, a method called deep cross-modal hashing by exploiting instance-level correspondences (DCMHIC) is proposed. Most existing cross-modal hashing methods ignore the instance-level correspondences between modalities, i.e., the semantically aligned object-phrase pairs in an image-sentence pair, which often results in false positives and decreases the retrieval performance of the hash codes. DCMHIC addresses this problem by embedding the instance-level correspondences into the cross-modal hash codes. In DCMHIC, a graph is constructed for each database point, and a representation vector of this graph is learned as the cross-modal embedding of the database point. The instances in each data point are regarded as the nodes of the graph, and each edge is conditioned on the instance-level correspondences between the image and the sentence. The correspondences between instances can thus be captured in an unordered way and then embedded into the cross-modal hash codes. Because query points lack bimodal information, two modality-specific hash functions are learned to map query points into binary codes. A constraint is added to the query hash functions to guarantee that the hash codes of query points lie in the cross-modal hash space of the database points. Extensive evaluations on two benchmark datasets show that the proposed DCMHIC method yields substantial improvements over state-of-the-art methods.
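To make the idea of a soft decision tree as a differentiable hashing projection more concrete, the following is a minimal NumPy sketch of the forward pass only. It is an illustration under stated assumptions, not the DSHNP formulation: the abstract does not give the tree depth, the gating function, the leaf parameterization, or the regularizers, so the sigmoid gates, the scalar leaf values, and all names (`soft_tree_leaf_probs`, `soft_tree_projection`, `hash_code`) are hypothetical choices made here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def soft_tree_leaf_probs(x, W, b):
    """Leaf probabilities of a complete binary soft decision tree.

    Assumption: each inner node i routes right with probability
    sigmoid(W[i] @ x + b[i]); nodes are stored in breadth-first order,
    so node i has children 2*i + 1 (left) and 2*i + 2 (right).
    """
    n_inner = W.shape[0]
    go_right = sigmoid(W @ x + b)
    probs = np.zeros(2 * n_inner + 1)
    probs[0] = 1.0  # every input reaches the root
    for i in range(n_inner):
        probs[2 * i + 1] = probs[i] * (1.0 - go_right[i])
        probs[2 * i + 2] = probs[i] * go_right[i]
    return probs[n_inner:]  # the last n_inner + 1 nodes are the leaves


def soft_tree_projection(x, W, b, leaf_values):
    """Smooth projection: expected leaf value under the routing probabilities."""
    return float(soft_tree_leaf_probs(x, W, b) @ leaf_values)


def hash_code(x, trees):
    """One bit per tree: the sign of that tree's projection (in {-1, +1})."""
    return np.array(
        [1 if soft_tree_projection(x, W, b, v) >= 0 else -1 for W, b, v in trees]
    )


# Toy setup (hypothetical sizes): 16-dim features, 8 bits, depth-2 trees.
rng = np.random.default_rng(0)
dim, n_bits, n_inner = 16, 8, 3  # 3 inner nodes -> 4 leaves
trees = [
    (rng.normal(size=(n_inner, dim)), rng.normal(size=n_inner),
     rng.normal(size=n_inner + 1))
    for _ in range(n_bits)
]
x = rng.normal(size=dim)
leaf_probs = soft_tree_leaf_probs(x, *trees[0][:2])
code = hash_code(x, trees)
```

Because every routing decision is a sigmoid rather than a hard threshold, the projection is a smooth function of both the input and the tree parameters, which is the property that lets such a tree be trained end to end with backpropagation. The final sign step is not differentiable; how the thesis handles it during training is not stated in the abstract.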
Keywords/Search Tags:large-scale image retrieval, deep hashing, nonlinear projection, multi-instance ranking, instance-level correspondences