
Research Of P2P Document Query Based On Semantic Similarity

Posted on: 2009-02-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2178360272977166
Subject: Computer application technology
Abstract/Summary:
Peer-to-Peer (P2P) networks have recently become a research hotspot because of their self-organization, strong fault tolerance, and good scalability, and they are regarded as one of the key technologies of the future Internet. With the growing prevalence of P2P applications, distributed search has become a central topic in P2P research. The key problem is how to find satisfying documents quickly and accurately in a large-scale P2P system. Most current research, however, considers only the distance between nodes or between documents and neglects the effect of semantics on the query result, while methods based on DHT (distributed hash tables) support only exact-match queries. Current methods therefore have two main disadvantages: (1) irrelevant content in the returned documents lowers query precision; (2) because of improper indexing, many relevant documents cannot be found, which lowers search recall.
Because semantics has received little study in P2P networks, the thesis introduces semantic similarity into distributed search, with the aim of returning results on the basis of semantic similarity. Unlike most current distributed search methods, the thesis uses a model that combines a structured index with unstructured routing. To compute semantic similarity, the thesis improves the representation of documents: each document is represented by a simple concept list, which indexes it by a given category, its keywords, and the frequency of each keyword in the document. Semantic similarity is computed using WordNet. The search model borrows from social networks, in which needed information is found through social relations. Every node creates an LRIT (Local Resource Indexed Table) and an NRIT (Neighbor Resource Indexed Table) to index its own resources and those of its neighbors, respectively, rather than simply maintaining the resources as other methods do.
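The abstract does not give the similarity formula or the table layouts, so the following is only a minimal sketch of how a simple concept list, an LRIT, and an NRIT might fit together. Several parts are assumptions made for illustration: the toy is-a taxonomy `PARENT` stands in for WordNet, the path-based measure 1 / (1 + path length) and the frequency-weighted average in `document_similarity` are one plausible choice rather than the thesis's formula, and the names `ConceptList`, `Node`, `local_hits`, and `best_neighbor` are hypothetical.

```python
from dataclasses import dataclass, field

# Toy is-a taxonomy (child -> parent). Hypothetical data standing in for WordNet.
PARENT = {
    "p2p": "network",
    "lan": "network",
    "network": "system",
    "database": "system",
    "system": None,
}


def ancestors(term):
    """Return the chain from term up to the root, term included."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT.get(term)
    return chain


def term_similarity(a, b):
    """Path-based similarity 1 / (1 + is-a path length); 0.0 if unrelated."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:  # first shared ancestor walking up from a
            return 1.0 / (1 + steps_a + up_b.index(node))
    return 0.0


@dataclass
class ConceptList:
    """Simple concept list: a category plus keyword frequencies."""
    category: str   # stored for indexing; not used by the score below
    keywords: dict  # keyword -> frequency in the document


def document_similarity(query_terms, doc):
    """Assumed score: frequency-weighted mean of each keyword's best query match."""
    total = sum(doc.keywords.values())
    if total == 0 or not query_terms:
        return 0.0
    score = 0.0
    for keyword, freq in doc.keywords.items():
        score += freq * max(term_similarity(keyword, q) for q in query_terms)
    return score / total


@dataclass
class Node:
    name: str
    lrit: dict = field(default_factory=dict)  # LRIT: doc id -> ConceptList (own documents)
    nrit: dict = field(default_factory=dict)  # NRIT: neighbor name -> list of ConceptList

    def local_hits(self, query_terms, threshold=0.3):
        """Ids of own documents whose score reaches the (assumed) threshold."""
        return [doc_id for doc_id, doc in self.lrit.items()
                if document_similarity(query_terms, doc) >= threshold]

    def best_neighbor(self, query_terms):
        """Neighbor whose indexed documents score highest, or None if NRIT is empty."""
        scores = {
            neighbor: max((document_similarity(query_terms, d) for d in docs), default=0.0)
            for neighbor, docs in self.nrit.items()
        }
        return max(scores, key=scores.get) if scores else None
```

In this sketch a node would first consult its LRIT and, if nothing local qualifies, forward the query to the neighbor that its NRIT suggests is semantically closest instead of flooding all neighbors.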
In addition, the thesis creates a query-history mechanism to reduce redundancy: it memorizes query results so that later queries can reuse them. Model analysis and simulation show that the semantic-based search mechanism is more efficient than the traditional search method in Gnutella-like systems. The new method improves search recall, shortens the search path length, and decreases the total number of messages in the network.
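The abstract does not describe how the query history is stored, so the sketch below only illustrates the general idea of memorizing results so that a repeated query generates no new network messages. The class `QueryHistory`, the helper `answer`, the bounded capacity, and the least-recently-used eviction are all assumptions for illustration, not details taken from the thesis.

```python
from collections import OrderedDict


class QueryHistory:
    """Bounded record of past queries and their results (hypothetical design)."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._entries = OrderedDict()  # query key -> list of results

    @staticmethod
    def _key(query_terms):
        # Order- and case-insensitive key, so reordered terms hit the same entry.
        return frozenset(term.lower() for term in query_terms)

    def lookup(self, query_terms):
        """Return remembered results, or None if this query was not seen."""
        key = self._key(query_terms)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def remember(self, query_terms, results):
        """Store results, evicting the least recently used entry when full."""
        key = self._key(query_terms)
        self._entries[key] = list(results)
        self._entries.move_to_end(key)
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)


def answer(query_terms, history, search_network):
    """Serve from history when possible; otherwise search and memorize the result."""
    cached = history.lookup(query_terms)
    if cached is not None:
        return cached
    results = search_network(query_terms)  # caller-supplied search function
    history.remember(query_terms, results)
    return results
```

Here `search_network` is whatever function actually forwards the query through the overlay; only queries that miss the history reach it, which is where the saving in messages would come from.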
Keywords/Search Tags:Peer-to-Peer network, Document query, Distributed search, Simple concept list, Semantic similarity