The Distribution Scheduling And Result Merging Of Distributed Search Engine System

Posted on: 2013-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: W Q Lu
Full Text: PDF
GTID: 2248330374474874
Subject: Computer system architecture
Abstract/Summary:
With the development of the Internet, the volume of information on the network is growing explosively. Many traditional retrieval systems can cover only certain areas of the available information resources and cannot cover the whole network because of limited funds and equipment. In this situation, distributed information retrieval offers a good solution: as a distributed architecture, it can make effective use of idle distributed resources to provide information retrieval services.

Distributed information retrieval mainly refers to the process of obtaining useful information from a large number of heterogeneous information resources by using distributed computing and mobile agent technology. However, because different information resources have different data storage structures and retrieval strategies, a distributed information retrieval system always faces two problems: how to acquire resource descriptions and choose resources based on those descriptions for a given query, which is the problem of scheduling strategy; and how to merge the result lists from different resources, which is the problem of result merging.

In this paper, a practical distributed search engine system, the next-generation Internet distributed search system named SE6, is introduced, and the research on scheduling strategy and result merging has been carried out on this system. For the scheduling strategy, we suggest two ways to obtain resource description information and then select resources based on the description and the retrieval history. For result merging, we propose a similarity principle and a diversity principle, and construct a merging strategy that differs from traditional ones. Experiments have been conducted on the system to verify our research. The experimental results demonstrate that our scheduling strategy and result merging strategy improve the recall and precision of the system's retrieval results while maintaining the diversity of the retrieval results.
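The abstract does not give the formulas used in SE6, so the following Python sketch only illustrates the two problems it names. Everything in it is an assumption made for illustration: the function names `select_resources` and `merge_results`, the blending weight `alpha`, the per-source `source_penalty`, the use of term overlap as a resource description score, and the use of min-max normalised scores as a stand-in for similarity. It is not the thesis's actual method.

```python
from collections import Counter


def select_resources(query, descriptions, history, k=2, alpha=0.7):
    """Hypothetical scheduling step: rank resources for a query.

    descriptions: resource name -> dict of term frequencies (assumed form
                  of a "resource description").
    history:      resource name -> past success rate in [0, 1] (assumed form
                  of "retrieval history").
    """
    terms = query.lower().split()
    scores = {}
    for name, desc in descriptions.items():
        total = sum(desc.values()) or 1
        # Fraction of the description's term mass that matches the query.
        overlap = sum(desc.get(t, 0) for t in terms) / total
        scores[name] = alpha * overlap + (1 - alpha) * history.get(name, 0.0)
    # Highest score first; ties broken by name for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


def merge_results(result_lists, source_penalty=0.2):
    """Hypothetical merging step: normalise, then pick greedily with a
    penalty on sources that already contributed, to keep the list diverse.

    result_lists: source name -> list of (doc_id, raw_score).
    """
    pool = []
    for source, results in result_lists.items():
        if not results:
            continue
        raw = [score for _, score in results]
        lo, hi = min(raw), max(raw)
        span = (hi - lo) or 1.0
        # Min-max normalisation makes scores from different sources comparable.
        for doc, score in results:
            pool.append((doc, source, (score - lo) / span))

    merged, used = [], Counter()
    while pool:
        # Best adjusted score first; ties broken by document id.
        best = min(
            pool,
            key=lambda item: (-(item[2] - source_penalty * used[item[1]]), item[0]),
        )
        pool.remove(best)
        used[best[1]] += 1
        merged.append(best[0])
    return merged
```

In this sketch, raising `source_penalty` pushes the merged list toward alternating between sources, while setting it to 0 reduces the merge to a plain sort by normalised score.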
Keywords/Search Tags: information retrieval, distributed, scheduling, result merging