Hashing Algorithms For Multimedia Nearest Neighbor Search

Posted on: 2017-05-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 1368330542992981
Subject: Intelligent information processing
Abstract/Summary:
With the rapid development of digital multimedia technology, the mobile Internet, the Internet of Things, and cloud storage, tremendous amounts of multimedia data have accumulated on the web, and the world has entered the era of multimedia big data. This data carries substantial economic and social value and brings new challenges and opportunities for national economic and social development. As the volume of multimedia data grows rapidly, how to store, manage, and analyze it efficiently has become a major concern for researchers and engineers. Hash-based nearest neighbor search is an effective way to manage and analyze large-scale multimedia data because of its remarkable efficiency gains and storage reductions. Existing hash-based approximate nearest neighbor search methods are effective on unimodal data. In practical applications, however, multimedia data (image, video, audio, text, etc.) usually has distinctive properties such as massive volume, multiple modalities, and semantic interconnection. Given these characteristics, efficient hashing methods for cross-modal semantic similarity search need further study. This dissertation is therefore dedicated to developing new hashing methods for approximate nearest neighbor search on large-scale multimedia data. Targeting both unimodal and multimodal data, it systematically studies the hashing problem at three levels: semantic similarity search within a single modality, cross-modal feature similarity search, and cross-modal semantic similarity search. The main contributions are summarized as follows.

1. A semi-supervised constraints preserving hashing method is proposed for semantic similarity search within a single modality. Existing semi-supervised unimodal hashing methods often preserve semantic similarities only for the low-dimensional embeddings; when these embeddings are converted into binary codes, quantization error accumulates and degrades performance. To this end, we propose a novel semi-supervised hashing method that preserves pairwise constraints for both the low-dimensional embeddings and the binary codes. It first represents data points by cluster centers to preserve the neighborhood structure of the data and reduce dimensionality. The constraint information is then fully used to embed the derived representations into a discriminative low-dimensional space by maximizing the discriminant Hamming distance and the data variance. Finally, optimal binary codes are obtained by further preserving the semantic similarities while quantizing the low-dimensional embeddings. Thorough experiments on standard databases show the superior performance of the proposed method.

2. A semantic topic multimodal hashing method is proposed for cross-modal feature similarity search. Most existing unsupervised multimodal hashing methods embed heterogeneous data into a common low-dimensional Hamming space and then round the continuous embeddings to obtain binary codes. By relaxing the discrete constraints, they often neglect the inherently discrete nature of hashing, which degrades retrieval performance, especially for long codes. To address this, a novel semantic topic multimodal hashing method is developed. It first learns latent topics from texts and images quickly and efficiently, and then generates hash codes directly by determining whether a topic is contained in a text or an image. The quantization error caused by relaxation strategies is thereby avoided. Experimental results demonstrate that the proposed method achieves higher search accuracy and shorter training time than several state-of-the-art methods.

3. A label consistent matrix factorization hashing method is proposed for cross-modal semantic similarity search. Most existing supervised multimodal hashing methods are designed mainly to preserve pairwise similarities. When semantic labels of the training data are given, they often transform the labels into pairwise similarities, which consumes enormous storage space and a large amount of computation and therefore makes these methods unscalable to large-scale data sets. Furthermore, transforming labels into pairwise similarities loses the category information of the training data, which affects the consistency between hash codes of the same category. To address these challenges, we propose label consistent matrix factorization hashing, which directly uses semantic labels to guide the hash learning procedure. Considering that relevant data from different modalities are semantically correlated, it transforms heterogeneous data into latent semantic spaces in which multimodal data from the same category share the same representation. Hash codes quantized from the obtained representations are therefore consistent with the semantic labels of the original data and have more discriminative power for the cross-modal similarity search task. Thorough experiments on standard databases show that the proposed method outperforms several state-of-the-art methods on the cross-modal similarity search task.

4. A multimodal discriminative binary embedding method is proposed to further improve the accuracy of cross-modal semantic similarity search. The overwhelming majority of supervised multimodal hashing methods ignore the discriminative property in the hash learning process, which leaves hash codes from different classes poorly distinguished and therefore reduces the accuracy of nearest neighbor search. To this end, we propose multimodal discriminative binary embedding to learn discriminative hash codes. First, it formulates hash function learning as a classification problem, in which the binary codes generated by the learned hash functions are expected to be discriminative. It then exploits label information to discover the shared structures within heterogeneous data. Finally, the learned structures are preserved in the hash codes so that data in the same class produce similar binary codes. The proposed method thus preserves both discriminability and similarity in the hash codes and enhances retrieval accuracy. Thorough experiments on benchmark data sets demonstrate that the proposed method achieves excellent accuracy and competitive computational efficiency compared with state-of-the-art methods on large-scale cross-modal similarity search.

In summary, this dissertation proposes four novel hashing algorithms to improve the accuracy of hash-based nearest neighbor search from three perspectives: high-level semantic similarity search within a single modality, cross-modal feature similarity search, and cross-modal semantic similarity search. Theoretical analyses and thorough experiments show the superior performance of the proposed methods over existing methods.
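For readers unfamiliar with the basic mechanics the abstract builds on, the following is a minimal sketch of generic hash-based nearest neighbor search: real-valued vectors are quantized to binary codes, and neighbors are ranked by Hamming distance. It uses random-hyperplane hashing (the sign of a random projection), a standard data-independent baseline, and is not any of the four learned methods proposed in the dissertation. All function names, the code length, and the toy data are assumptions made for illustration.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw one random Gaussian hyperplane per hash bit (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def hash_code(vector, hyperplanes):
    """Pack the signs of the projections into one integer, one bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        projection = sum(w * x for w, x in zip(plane, vector))
        # Quantization step: keep only the sign of the continuous projection.
        code = (code << 1) | (1 if projection >= 0 else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def nearest_by_hamming(query_code, database_codes, k=1):
    """Return the indices of the k database codes closest to the query code."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return order[:k]


# Toy data (assumed): two similar vectors and one pointing the opposite way.
planes = make_hyperplanes(num_bits=16, dim=3)
database = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [-1.0, 0.0, 0.0]]
codes = [hash_code(v, planes) for v in database]
query = hash_code([1.0, 0.0, 0.0], planes)
print(nearest_by_hamming(query, codes, k=2))
```

Each item is stored as a 16-bit integer and compared with an XOR and a bit count, which is the source of the storage and speed advantages the abstract refers to. The sign step is also where quantization error enters, which the learned methods above aim to reduce.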
Keywords/Search Tags: hashing, multimedia, supervised information, nearest neighbor search, cross-modal retrieval