
Research On Query Based Clustering In P2P Overlay Networks

Posted on: 2011-07-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ma
Full Text: PDF
GTID: 2178360308962555
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, search engines have become an everyday tool. However, as the volume of information on the Internet and users' demands grow rapidly, traditional search engines are becoming increasingly unsatisfactory. How to provide efficient, high-quality search services has become a hot topic.

P2P search has been proposed against the background of the wide application of P2P technology. Structured P2P topologies are well suited to P2P search because of their efficient routing and short search response times. In DHT-based P2P networks, however, data identifiers are produced by consistent hashing, so similar documents are distributed randomly among peers, which makes complex queries challenging. One way to support efficient complex search is text clustering. Common text clustering methods are currently based on document content and require global document information, which makes them difficult to apply in a P2P environment. Moreover, the clusters they generate are not well defined and users' search demands are not taken into account, so these methods lack flexibility.

This thesis proposes the QBC algorithm, which implements query-based clustering in P2P networks, guided by the historical query set. QBC consists of a pull mode and a push mode. Pull mode is guided by the historical query set: pull requests are sent to other peers to actively pull documents and form clusters. Push mode is guided by the Vector Space Model: documents on a peer that are unsuitable for clustering there are redistributed to more suitable peers, forming clusters on remote peers.

To illustrate the effectiveness of QBC, a multiple-keyword query simulation compares the Inverted List Intersection method with the QBC algorithm in terms of the number of peers visited, the documents in the search results, and network traffic.
Experimental results show that the QBC algorithm reduces the number of peers visited during multiple-keyword queries and also decreases query response time and network traffic, at a minor cost in recall rate.
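
The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it names, under stated assumptions. The first part shows the Inverted List Intersection baseline: each keyword's posting list lives on the peer its hash maps to, so a multiple-keyword query may have to visit one peer per keyword. The second part shows a Vector Space Model comparison (cosine similarity over raw term counts) of the kind that could guide a push-mode decision about which peer suits a document. All names here (`NUM_PEERS`, `peer_for_key`, `build_inverted_index`, `intersect_query`, `best_peer_for`) are hypothetical, the modulo hash is a simplified stand-in for a real DHT lookup, and none of this is the thesis's actual QBC code.

```python
import hashlib
import math
from collections import Counter, defaultdict

NUM_PEERS = 8  # hypothetical network size, not taken from the thesis


def peer_for_key(key, num_peers=NUM_PEERS):
    """Map a key to a peer id by hashing (simplified stand-in for a DHT lookup)."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_peers


def build_inverted_index(docs):
    """Store each keyword's posting list on the peer responsible for that keyword."""
    index = defaultdict(lambda: defaultdict(set))  # peer -> keyword -> doc ids
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index[peer_for_key(word)][word].add(doc_id)
    return index


def intersect_query(index, keywords):
    """Baseline: fetch one posting list per keyword and intersect them.

    Returns the matching doc ids and the set of peers that were contacted.
    """
    visited = set()
    result = None
    for word in keywords:
        peer = peer_for_key(word)
        visited.add(peer)
        postings = index[peer].get(word, set())
        result = set(postings) if result is None else result & postings
    return (result or set()), visited


def term_vector(text):
    """Represent a text as raw term counts (a very simple Vector Space Model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def best_peer_for(doc_text, peer_vectors):
    """Assumed push-style choice: the peer whose cluster vector is most similar."""
    doc_vec = term_vector(doc_text)
    return max(peer_vectors, key=lambda p: cosine_similarity(doc_vec, peer_vectors[p]))


if __name__ == "__main__":
    docs = {
        "d1": "p2p overlay clustering",
        "d2": "p2p search engine",
        "d3": "vector space model clustering",
    }
    index = build_inverted_index(docs)
    hits, visited = intersect_query(index, ["p2p", "clustering"])
    print("matching documents:", sorted(hits))
    print("peers visited:", len(visited))

    peer_vectors = {
        "A": term_vector("p2p overlay search"),
        "B": term_vector("vector space model"),
    }
    print("push target:", best_peer_for("vector model clustering", peer_vectors))
```

In this sketch the number of peers visited by the baseline grows with the number of distinct keywords, which is the cost that clustering related documents on the same peer aims to reduce.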
Keywords/Search Tags: P2P, CLUSTERING, PULL MODE, PUSH MODE, VECTOR SPACE MODEL