
Design And Implementation Of Meta-search Engine System Based On Distributed Architecture

Posted on: 2014-01-08
Degree: Master
Type: Thesis
Country: China
Candidate: L Dong
GTID: 2268330401977619
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development and wide application of computer and Internet technology, society has entered an era of information explosion. The volume of information on the Internet, and the diversity of its structure, make it very difficult for users to find what they want quickly without the help of tools. A report on Internet users released in 2012 by the China Internet Network Information Center (CNNIC) states that, with the rapid development of the Internet, online data continues to grow explosively, with an annual growth rate exceeding a factor of 10. Although the channels through which users obtain information are diversifying, portals, search engines, blogs, microblogs, forums and social networking sites remain the main ones.

The meta-search engine is a product of the development of Web information retrieval technology. Because it aggregates the results of multiple member search engines and optimizes their ranking with a suitable algorithm, the results it returns can greatly improve recall and precision, and it has been widely welcomed by Internet users. However, as the number of member search engines grows and more results are returned, retrieval efficiency and the final ranking of documents have become the bottleneck in the development of meta-search engines.

Distributed systems were developed to connect individual hosts over a network so that a complex task can be decomposed into a number of small, low-complexity subtasks.
By using a large number of low-cost machines on the network to process these smaller subtasks, serial operation can be replaced by parallel operation, which can greatly improve retrieval efficiency for users.

This thesis first describes the state of meta-search engine development at home and abroad in recent years and, on that basis, introduces the main research topics.

To address the deficiencies of result merging and ranking in meta-search engines, the thesis improves the existing position-based ranking algorithm. A document's position information is converted into a document score, with the number of member search engines added as a factor. At the same time, a domain-name cache table is constructed to calculate a score for each document's URL. The weight of each member search engine is calculated by least-squares estimation of the parameters of a multiple linear regression. Finally, the total score of each document is calculated with a linear combination model and used as the basis for ranking.

Stand-alone retrieval systems fall far short of users' real-time retrieval requirements, so the master-slave structure model from distributed systems is introduced into the meta-search engine. To improve system efficiency, the thesis uses HTTP/1.1 as the communication protocol between modules and the CPU load ratio of the download nodes as the distribution strategy of the distribution module. All documents returned by the retrieval modules are merged centrally and ranked by score using the fusion algorithm. For system testing, five currently popular general-purpose search engines were selected as member search engines, the query test set of a conference on search engines and Web data mining was used as test data, and the average precision of the returned results was measured.
Test results in a real network environment show that the system achieves a substantial improvement in precision and has good prospects for development and application.
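
The score fusion summarized in the abstract (a position-based score, a URL score from a domain-name table, per-engine weights, and a linear combination) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the names `position_score` and `fuse`, the linear rank-to-score mapping, the mixing parameter `alpha`, and the sample weights and domain scores. The engine weights are simply passed in, whereas the thesis estimates them by least squares.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def position_score(rank, list_length):
    """Map a 1-based rank to a score in (0, 1]; rank 1 scores 1.0.

    Hypothetical mapping: the thesis's own position formula is not
    given in the abstract.
    """
    return (list_length - rank + 1) / list_length


def fuse(result_lists, engine_weights, domain_scores, alpha=0.8):
    """Merge ranked URL lists from several member engines.

    result_lists:   {engine name: [url, ...]} in ranked order
    engine_weights: {engine name: weight} (assumed given here)
    domain_scores:  {hostname: score}, standing in for the
                    domain-name cache table
    alpha:          assumed mix between position and URL scores
    """
    # Weighted sum of position scores over all engines that returned the URL.
    position_totals = defaultdict(float)
    for engine, urls in result_lists.items():
        weight = engine_weights[engine]
        for rank, url in enumerate(urls, start=1):
            position_totals[url] += weight * position_score(rank, len(urls))

    # Linear combination of the position total and the URL (domain) score.
    fused = {}
    for url, position_total in position_totals.items():
        host = urlsplit(url).hostname or ""
        url_score = domain_scores.get(host, 0.0)  # unknown domains score 0
        fused[url] = alpha * position_total + (1 - alpha) * url_score

    return sorted(fused, key=fused.get, reverse=True)


# Illustrative data only; not taken from the thesis.
result_lists = {
    "engine_a": ["http://a.example/1", "http://b.example/2", "http://c.example/3"],
    "engine_b": ["http://b.example/2", "http://a.example/1"],
}
engine_weights = {"engine_a": 0.6, "engine_b": 0.4}
domain_scores = {"a.example": 0.5, "b.example": 0.9}

ranking = fuse(result_lists, engine_weights, domain_scores)
print(ranking)
```

With this sample data the two documents returned by both engines tie on weighted position, so the domain score decides the order: `http://b.example/2` comes first, then `http://a.example/1`, then `http://c.example/3`.
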
Keywords/Search Tags: Information retrieval, Search engine, Meta-search engine, Distributed system, Rank merging