Research On Hierarchical Clustering Index Method Based On Deep Hash

Posted on: 2020-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: T T Huang
Full Text: PDF
GTID: 2438330626953284
Subject: Software engineering
Abstract/Summary:
The volume of information data is exploding with the rapid development of network and multimedia technology, and information retrieval has become one of the most active research directions at home and abroad. An effective indexing method is central to improving the performance of large-scale data retrieval. Researchers have studied indexing methods extensively, but the techniques for retrieval accuracy, retrieval speed, and the ranking of retrieval results are still immature, and more in-depth research is needed. Starting from these technical directions, this thesis carries out a series of studies. The main innovations are as follows:

(1) High-dimensional features cause slow queries and large storage requirements. The binary hash codes learned by existing hashing algorithms can address this problem, but they greatly reduce retrieval accuracy. This thesis proposes a hashing algorithm based on deep learning. The algorithm designs an end-to-end deep hash learning network that extracts features and learns hash functions within the same framework, so that the two parts can give feedback to each other and promote learning. In training, the supervisory information of single-sample labels and paired-sample labels is combined, which not only preserves the semantic information of each sample but also allows each sample pair to be clearly distinguished. At the same time, the output hash codes are made uniformly distributed to improve the information capacity carried by the hash codes. To enhance the adaptability of the hash codes, the learning of rotation invariance is added to the optimization of the loss function.

(2) To address the long search time and poor returned results of traditional methods such as brute-force search in large binary feature databases, this thesis proposes a binary hierarchical clustering index method, which comprises the construction of the hierarchical clustering index, retrieval on the index, and the reranking of retrieval results. For index construction, a dynamic Integration of Partition and Hierarchical Decomposition algorithm is proposed: clustering centers are selected according to a dynamic distance threshold to balance the partition of the data set and improve the clustering effect. For queries on the hierarchical clustering index, the optimal subclass traversal method is used. Finally, the retrieved candidate set is reranked: two algorithms for calculating the bit weights of hash codes are proposed, and these weights are used to compute the weighted Hamming distance between the query sample and the candidate set samples, which returns more refined reranking results and improves retrieval accuracy.

This thesis improves and innovates on existing research methods for mass data indexing. The proposed method performs well on multiple public data sets and achieves better results than other existing classical algorithms. Experimental results show that the proposed method can realize efficient and accurate data search, which is of great significance to the development of information retrieval and provides more convenient services for the needs of an information-based society.
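The reranking step in (2) can be illustrated with a minimal Python sketch. It only shows the general idea of a weighted Hamming distance, where each differing bit contributes its own weight instead of a flat 1. The abstract does not describe the two bit-weight algorithms proposed in the thesis, so the function names (`weighted_hamming`, `rerank`) and the weight values below are hypothetical placeholders, not the author's method.

```python
from typing import List, Sequence, Tuple


def weighted_hamming(query: Sequence[int],
                     code: Sequence[int],
                     weights: Sequence[float]) -> float:
    """Sum the weights of the bit positions where the two codes differ."""
    return sum(w for q, c, w in zip(query, code, weights) if q != c)


def rerank(query: Sequence[int],
           candidates: Sequence[Sequence[int]],
           weights: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (candidate index, distance) pairs, nearest candidate first."""
    scored = [(weighted_hamming(query, code, weights), idx)
              for idx, code in enumerate(candidates)]
    scored.sort()  # ascending distance; ties fall back to candidate index
    return [(idx, dist) for dist, idx in scored]


# Hypothetical 4-bit example; the weights are made up for illustration.
query = [1, 0, 1, 1]
candidates = [
    [1, 0, 1, 0],  # differs from the query in the last bit
    [0, 0, 1, 1],  # differs from the query in the first bit
    [1, 0, 1, 1],  # identical to the query
]
weights = [0.4, 0.3, 0.2, 0.1]

ranking = rerank(query, candidates, weights)
```

Under the plain Hamming distance the first two candidates tie at distance 1. With per-bit weights the tie is broken by which bit differs, which is what allows a finer ordering of the candidate set.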
Keywords/Search Tags:deep learning, hashing algorithm, hierarchical clustering, index method, reranking