Research On Content Search Engine Based On The Topic Relevance Routing In P2P Networks

Posted on: 2007-04-08
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Xu
Full Text: PDF
GTID: 2178360212968548
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of Internet technology and the exponential growth of network resources, the speed of search engines largely determines how effectively Internet resources are used, and the shortcomings of existing search engines continue to drive academic research into search theory and techniques. As a new search method, the P2P content search engine is receiving increasing attention from academia, and a variety of search algorithms have been proposed and applied.

This thesis analyses the existing research on search engines in China and abroad and describes content search and P2P techniques. It discusses the advantages and disadvantages of P2P content retrieval and then identifies a problem with current P2P content search engines: increasing the recall rate markedly increases search time and sharply reduces network efficiency. To address this problem, the thesis proposes a search request routing algorithm based on topic relevance represented by k-High Frequency Terms (RTRkFT). The algorithm has been applied in the Yooyoo search engine prototype, and experiments validate its feasibility and effectiveness.

The main contents of the thesis are as follows:

(1) Content search and P2P techniques are described; they are the basis of P2P content search engines and the prerequisite background for this thesis.
(2) JXTA and Lucene are analysed. JXTA is an open platform for developing P2P systems, and Lucene is a content search toolkit. This analysis forms the basis for developing the Yooyoo search engine prototype.
(3) Because building topics with a clustering algorithm is not suited to P2P conditions, a method is proposed that uses a k-High Frequency Terms Vector to represent the topic of a document collection. Its advantages, such as low computational cost, good scalability and easy calculation of topic similarity, make it suitable for use in P2P conditions.
(4) To address the problem that increasing the recall rate markedly increases search time and reduces network efficiency, RTRkFT is proposed. It decides the routing direction by computing the relevance between the retrieval result topic and the node topic, to ensure that a retrieval request first reaches...
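
The idea in items (3) and (4) can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the simple tokenizer, and the use of cosine similarity as the relevance measure are all assumptions made here for illustration, since the abstract does not specify how relevance is computed. A topic is represented as the k most frequent terms of a document collection, and neighbouring nodes are ranked by how closely their topic matches the topic of the retrieval results.

```python
import math
import re
from collections import Counter


def k_high_frequency_terms(documents, k):
    """Return the k most frequent terms of a document collection as {term: count}.

    Hypothetical helper: the tokenizer (lowercase alphanumeric runs) is an
    assumption, and no stop-word filtering is applied.
    """
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z0-9]+", doc.lower()))
    return dict(counts.most_common(k))


def cosine_relevance(topic_a, topic_b):
    """Cosine similarity between two term vectors (assumed relevance measure)."""
    shared = set(topic_a) & set(topic_b)
    dot = sum(topic_a[t] * topic_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in topic_a.values()))
    norm_b = math.sqrt(sum(v * v for v in topic_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_neighbors(result_topic, neighbor_topics):
    """Order neighbour ids by descending relevance to the result topic.

    neighbor_topics maps a (hypothetical) node id to that node's topic vector.
    Ties are broken by node id so the ordering is deterministic.
    """
    scored = [
        (cosine_relevance(result_topic, topic), node_id)
        for node_id, topic in neighbor_topics.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node_id for _, node_id in scored]
```

In this sketch a node would forward the request to the highest-ranked neighbours first, so that nodes whose topic is closest to the retrieval results are reached early. Comparing two k-term vectors is cheap, which is in line with the low computational cost the abstract claims for the representation.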
Keywords/Search Tags: P2P, Search Engine, Context Retrieval, Topic Relevance, Topic Search Engine