| With the rapid development of the Internet and the accumulation of a large number of documents, retrieving documents efficiently has become an urgent problem. For today’s massive, high-dimensional document data, traditional indexing and retrieval techniques cannot meet users' need to find what they want quickly. As a result, attention has shifted from retrieval methods that focus on the completeness of the search results to hashing methods that focus on retrieval speed. Fast similarity search is a technique for large-scale document retrieval that trades some retrieval accuracy for a large gain in retrieval speed, which makes it valuable in massive document retrieval applications. It maps the high-dimensional space to a low-dimensional space with a manifold method, which reduces the dimensionality of the documents, and uses efficient hashing to accelerate the matching step of retrieval, so that users can quickly locate the documents they want. Because the semantic hashing method consumes a large amount of computing resources and does not use the information between documents during indexing, this paper follows the idea of semantic hashing and combines the spectral hashing index method with a Markov network to strengthen the relationships between documents, in order to obtain a better low-dimensional embedded subspace for the high-dimensional features. It also applies pruning to reduce the time and space complexity of indexing, so that high-dimensional features can be indexed and retrieved quickly and effectively. |
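
The general pipeline summarized above (reduce dimensionality, binarize, then match by Hamming distance) can be illustrated with a minimal sketch. It is not the method of this paper: it assumes plain PCA as a stand-in for the manifold embedding, uses a simple sign threshold as the hash function, and omits the Markov network and pruning steps entirely. The function names `fit_projection`, `hash_codes`, and `hamming_search`, the toy data, and the 16-bit code length are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_projection(X, n_bits):
    """Learn a mean vector and n_bits principal directions from the rows of X.

    Assumption: PCA stands in for the manifold embedding described in the text.
    """
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_bits].T

def hash_codes(X, mean, W):
    """Project to the low-dimensional space and keep one sign bit per dimension."""
    return ((X - mean) @ W > 0).astype(np.uint8)

def hamming_search(query_code, db_codes, k):
    """Return indices of the k database codes closest to query_code in Hamming distance."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")[:k]

# Toy data (hypothetical): 200 documents with 50 features each.
rng = np.random.default_rng(0)
docs = rng.random((200, 50))

mean, W = fit_projection(docs, n_bits=16)
codes = hash_codes(docs, mean, W)            # one 16-bit code per document
top = hamming_search(codes[3], codes, k=5)   # nearest neighbours of document 3
```

Comparing short binary codes is much cheaper than computing distances in the original feature space, which is where the speed gain comes from. The cost is that documents with different features can share a code, which is the loss of accuracy the text refers to.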