The Design And Implementation Of Distributed Education Network Information Retrieval System

Posted on: 2011-05-29 | Degree: Master | Type: Thesis
Country: China | Candidate: H Li | Full Text: PDF
GTID: 2178360308964334 | Subject: Computer system architecture
Abstract/Summary:
With the rapid development of next-generation networks and the maturing of Web 2.0 and other next-generation information technologies, information resources have become more and more distributed, which poses new challenges for the design of search engine architecture. Search engines such as Google, Yahoo and Baidu mainly handle queries for news, web pages and other general information, and their retrieval architecture is still centralized. The advantage of a distributed search engine is that it can combine unit search engines with different features. A reasonable architecture can combine hundreds of search engines with data collected from IPv6 networks and social networks to improve the system's coverage.
This thesis aims to build a CERNET-oriented distributed information retrieval system based on a number of clusters, making the system more structured and diversified and providing a unified search service.
This thesis designs and implements a distributed search system. The system consists of several unit search engines (Worker), multiple query proxy nodes (Querier), and a main node (Broker). Each Worker is a separate search engine. The Broker is composed of a network layer, a logic layer and an application layer. The Querier is likewise divided into a network layer, a logic layer and an application layer. The Querier's network layer is responsible for communication with the Broker, including receiving updated Worker status and sending heartbeat status information. An abstract adapter is responsible for integrating heterogeneous resources and converting interfaces. The three-tier architecture improves the scalability, fault tolerance and throughput of the system.
Based on Web Services and RMI technology, the system integrates search engines running on heterogeneous platforms, and an algorithm is designed to select a proxy node to respond to each user query so as to achieve load balancing.
This thesis also tests the system's performance in the following aspects: scalability, system throughput, the amount of request data, the cost of the distributed architecture, and the efficiency of the communication protocol. It also discusses the composition of the extra cost introduced by the distributed architecture.
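The abstract does not say how a proxy node is chosen, so the following is only a minimal sketch of one plausible scheme. It assumes that the Broker records each Querier's heartbeat together with a reported load figure, ignores Queriers whose heartbeat is stale, and picks the least-loaded remaining one. The names `Broker`, `QuerierStatus`, `heartbeat`, `select_querier`, the `load` field and the timeout value are all hypothetical and not taken from the thesis.

```python
import time
from dataclasses import dataclass, field


@dataclass
class QuerierStatus:
    """Last state reported by one Querier (hypothetical fields)."""
    load: int              # assumed metric, e.g. number of in-flight queries
    last_heartbeat: float  # time at which the last heartbeat was received


@dataclass
class Broker:
    """Tracks Querier heartbeats and picks a proxy node (illustrative only)."""
    heartbeat_timeout: float = 5.0  # assumed value, in seconds
    queriers: dict = field(default_factory=dict)

    def heartbeat(self, querier_id, load, now=None):
        # Record (or refresh) the status sent by a Querier.
        now = time.monotonic() if now is None else now
        self.queriers[querier_id] = QuerierStatus(load, now)

    def select_querier(self, now=None):
        # Keep only Queriers whose heartbeat is recent enough.
        now = time.monotonic() if now is None else now
        alive = {
            qid: status
            for qid, status in self.queriers.items()
            if now - status.last_heartbeat <= self.heartbeat_timeout
        }
        if not alive:
            return None
        # Least-loaded first; ties are broken by id for determinism.
        return min(alive, key=lambda qid: (alive[qid].load, qid))
```

Least-load selection is just one option; round-robin or weighted random choice would slot into `select_querier` in the same way, and in an RMI setting `heartbeat` would be exposed as a remote method instead of a local call.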
Keywords/Search Tags: search engine, distributed system, information retrieval, Web Services, RMI