
Research And Implementation Of Core Technology Of Distributed Search Engine

Posted on: 2016-07-12
Degree: Master
Type: Thesis
Country: China
Candidate: P C Bai
Full Text: PDF
GTID: 2308330470979777
Subject: Computer technology
Abstract/Summary:
With the arrival of the big data era, people have accumulated massive amounts of data in many aspects of daily production, life, and work, and the volume of data continues to grow rapidly every day. This poses a serious problem of information overload. Traditional centralized search engines, limited by storage capacity and computing speed, can no longer support fast queries over such huge amounts of data. A distributed search engine, by contrast, adopts a divide-and-conquer approach and runs on a large number of ordinary PCs; it can both store large volumes of data and return fast, accurate query results. Building a distributed search engine involves many complicated problems and difficulties. To study these in depth, this thesis constructs a distributed search engine oriented toward map queries and uses it to investigate the core technologies of distributed search. First, we design a distributed index structure suited to map data and implement dynamic index updates, supplemented by an index compression mechanism. Second, we provide a storage scheme for map data based on geographical position. At the same time, each independent search unit of the distributed search engine adopts an effective retrieval model to ensure the accuracy of its own results. The master server then uses a query distribution strategy and a data fusion strategy to provide users with the desired query results. Finally, we analyze the potential intent behind the submitted queries and determine the category of the query intent. In this way, the information users need can be placed at the front of the result list, improving the ranking quality of the distributed search engine.
The experimental results show that a well-designed distributed index is the basis for fast queries, and that distributed search, in which a large number of machines work collaboratively, can effectively shorten query time. Moreover, a good retrieval model returns query results ordered by relevance, while query intent analysis improves the user experience.
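The query distribution and data fusion step summarized in the abstract can be illustrated with a minimal scatter-gather sketch. This is not the thesis's implementation: the `SearchUnit` class, the `distributed_search` function, and the raw term-frequency scoring are all hypothetical stand-ins chosen to keep the example short. It assumes each search unit holds one shard of documents and that scores from different units are directly comparable, which holds here only because every unit uses the same simple scoring rule.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class SearchUnit:
    """Hypothetical independent search unit holding one shard of documents."""

    def __init__(self, docs):
        # docs maps a document id to its text, e.g. a place name on a map.
        self.docs = docs

    def search(self, query, k):
        """Return up to k (score, doc_id) pairs, best first."""
        terms = query.lower().split()
        scored = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            # Assumed scoring rule: total occurrences of the query terms.
            score = sum(words.count(term) for term in terms)
            if score > 0:
                scored.append((score, doc_id))
        return heapq.nlargest(k, scored)


def distributed_search(units, query, k=3):
    """Send the query to every unit in parallel, then fuse the partial results."""
    with ThreadPoolExecutor(max_workers=len(units)) as pool:
        partial_results = list(pool.map(lambda unit: unit.search(query, k), units))
    # Fusion: pool the per-unit top-k lists and keep the global top k.
    merged = [hit for part in partial_results for hit in part]
    return heapq.nlargest(k, merged)


if __name__ == "__main__":
    units = [
        SearchUnit({"a": "beijing railway station",
                    "b": "beijing west railway station beijing"}),
        SearchUnit({"c": "shanghai station",
                    "d": "beijing airport"}),
    ]
    print(distributed_search(units, "beijing station", k=2))
```

Because each unit returns only its own top k hits, the master merges at most k results per unit rather than every matching document. A real system would also need globally consistent scoring, for example shared collection statistics, before results from different units could be fused fairly.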
Keywords/Search Tags: Distributed Index, Distributed Search, Query Intention