Research on High-Dimensional Distributed Indexing Based on Locality-Sensitive Hashing

With the development of the Internet, the volume of multimedia data, including images, audio, and video, is growing ever faster. Helping users retrieve multimedia data quickly has become a major challenge that limits the usability of search engine technology. Although many research institutions and commercial companies have released content-based image search engines, there is still considerable room for improvement. In particular, the high memory consumption and computational overhead of high-dimensional indexing in content-based image search engines have long been a research focus.

To address these problems, the Locality-Sensitive Hashing (LSH) index can be combined with the Hadoop distributed system to improve the index architecture and computational model. The system restructures the LSH index into a loosely coupled structure that can be deployed across distributed query nodes. Meanwhile, the MapReduce distributed computational model is used during index creation to improve the efficiency of building the high-dimensional index. In addition, a distributed database stores the large volume of high-dimensional index data, which markedly improves the system's scalability. Furthermore, the query module uses a concurrent index-query cluster to handle the heavy load of user queries, and the high-dimensional index is kept in memory to improve query speed.

The experimental results show that introducing distributed computing and distributed storage improves the performance of index creation. At the same time, the loosely coupled index structure reduces synchronous communication overhead and makes highly concurrent, fast index search a reality.
Keywords/Search Tags: Locality-Sensitive Hashing, distributed indexing, content-based image search
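
The loosely coupled index structure and the map/reduce style of index creation described in the abstract can be illustrated with a minimal single-process sketch. It is not the system's actual implementation: the abstract does not name the hash family, so random-hyperplane (sign) hashing is assumed here, and every name (make_hyperplanes, bucket_key, map_phase, reduce_phase, query) is hypothetical. The map and reduce steps are plain Python functions standing in for Hadoop jobs, and an in-memory dictionary stands in for the distributed database.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_tables, bits_per_key, dim, seed=0):
    """Draw independent random hyperplanes for each hash table (assumed hash family)."""
    rng = random.Random(seed)
    return [
        [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits_per_key)]
        for _ in range(num_tables)
    ]


def bucket_key(planes, vector):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vector)) >= 0 else 0
        for plane in planes
    )


def map_phase(item_id, vector, tables):
    """Map step: emit one ((table_id, bucket_key), item_id) pair per hash table."""
    for table_id, planes in enumerate(tables):
        yield (table_id, bucket_key(planes, vector)), item_id


def reduce_phase(pairs):
    """Reduce step: group item ids by table and bucket.

    Tables share no state, so each one could be hosted on a separate query node.
    """
    index = defaultdict(lambda: defaultdict(list))
    for (table_id, key), item_id in pairs:
        index[table_id][key].append(item_id)
    return index


def query(index, tables, vector):
    """Union the matching bucket of every table into a candidate set."""
    candidates = set()
    for table_id, planes in enumerate(tables):
        candidates.update(index[table_id].get(bucket_key(planes, vector), []))
    return candidates


if __name__ == "__main__":
    tables = make_hyperplanes(num_tables=4, bits_per_key=8, dim=16, seed=1)
    rng = random.Random(2)
    data = {i: [rng.gauss(0.0, 1.0) for _ in range(16)] for i in range(50)}
    pairs = [p for i, v in data.items() for p in map_phase(i, v, tables)]
    index = reduce_phase(pairs)
    print(sorted(query(index, tables, data[7])))
```

Because each table is hashed and looked up independently, a query only needs to merge the candidate sets returned by the tables, which is one way to read the reduced synchronous communication that the abstract reports. A real deployment would re-rank the candidates by exact distance to the query vector, a step omitted here.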
