
The Theory And Application Research Of Information Retrieval In P2P Systems

Posted on: 2012-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: L Guo
Full Text: PDF
GTID: 2178330332990572
Subject: Computer application technology
Abstract/Summary:
Information retrieval in P2P systems is currently an active research topic. As internet technology develops and network resources grow, efficient P2P-based sharing of these resources offers both a useful supplement to traditional search engines and an effective approach to large-scale distributed search. This thesis analyzes four factors that affect information retrieval in P2P systems: network topology, network performance analysis, the resource sorting algorithm, and system implementation. Each of the four is studied as a key issue. For network topology, a multi-subject network model construction algorithm is presented; it classifies resources by subject, can quickly collect information about network resources, and can serve as the network structure of an information retrieval system. For network performance analysis, a simulation platform is built with the PeerSim simulator, and the performance of the multi-subject network is analyzed in terms of node utilization, number of nodes, self-recovery capability, and so on; the results can serve as a reference for evaluating the performance of information retrieval systems. For the resource sorting algorithm, a sorting algorithm based on node load is proposed, which addresses the decline in transmission quality caused by the dynamic nature of nodes. For system implementation, a file-sharing system based on topic division in a campus network is designed and implemented as an application of the information retrieval theory.

The main achievements of this thesis can be summarized as follows.

First, a network model construction algorithm based on multiple subjects was proposed.
To address the strong node autonomy and poor global information of peer-to-peer networks, an effective algorithm for collecting resource information was proposed. The method divided network resources into several subjects according to resource type. Resources with the same subject were clustered together by a periodic discovery algorithm, and the nodes sharing a subject were called a community. A number of super nodes were selected according to node capacity, forming a hierarchical network model. Because each community gathered the resource information for one subject, when a specific query arrived, relatively good results were obtained once the super node redirected the query to the corresponding community.

Second, the performance of the network model was analyzed with the PeerSim simulator. Because peer-to-peer networks often contain a very large number of nodes, setting up a real network environment is difficult, and simulation with a network simulator is an important research method. The experiment compared the multi-subject structure with single-subject and no-subject structures. The results showed that the network model not only collects approximately global node information but also converges quickly and is robust.

Third, a top-k sorting algorithm based on node load was proposed. The nodes of P2P networks are highly dynamic, there is no centralized control mechanism, and network conditions change constantly, so the quality of data transmission between nodes cannot be guaranteed. A sorting algorithm based on node load was therefore studied: node load and network conditions were predicted and taken into account when sorting the results.
Experimental results showed that the sorting algorithm that takes node load into account ranks resources held by high-performance nodes first and ensures high transmission quality of service.

Finally, a file-sharing system based on topic division in a campus network was designed and implemented. To address information retrieval and resource sharing in the campus network, a multi-subject solution was designed. The scheme took into account that campus resources are distributed by subject and department, and divided the resources into several subjects so that a user's search can be confined to the subnet for the related subjects. The system used a P2P architecture to organize and manage resources and provided resource retrieval, sharing, and management functions. The thesis presents the system architecture, the internal structure of a node, and the workflow, then introduces the overall design of the system; finally, a prototype system was designed and implemented in Java.
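
The abstract does not give the formula used by the load-based top-k sorting algorithm, so the following is only a minimal Python sketch of the general idea, not the thesis's method or its Java prototype. It assumes each candidate result carries a query relevance and a predicted node load, both in [0, 1], and combines them with a hypothetical weight `alpha`; the names `Hit`, `combined_score`, and `top_k` are illustrative.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One candidate result (all fields are illustrative assumptions)."""
    resource: str     # resource identifier
    node: str         # peer that holds the resource
    relevance: float  # query relevance in [0, 1]
    load: float       # predicted node load in [0, 1]; 1 means saturated


def combined_score(hit, alpha=0.6):
    # Assumed scoring rule: weighted sum of relevance and spare capacity.
    return alpha * hit.relevance + (1.0 - alpha) * (1.0 - hit.load)


def top_k(hits, k, alpha=0.6):
    # heapq.nlargest returns the k best hits, highest score first.
    return heapq.nlargest(k, hits, key=lambda h: combined_score(h, alpha))


hits = [
    Hit("lecture.pdf", "peer-a", relevance=0.9, load=0.95),
    Hit("lecture.pdf", "peer-b", relevance=0.9, load=0.10),
    Hit("notes.txt", "peer-c", relevance=0.5, load=0.20),
]
ranked = top_k(hits, k=2)
```

With these sample values, the heavily loaded `peer-a` drops out of the top two even though its copy is as relevant as the one on `peer-b`. Setting `alpha=1.0` reduces the ranking to relevance alone.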
Keywords/Search Tags: P2P, Information Retrieval, Resource Location, Network Model, Subject Division