Research and Implementation on Optimizing the Focus Spider Algorithm Based on Grid Technology

Posted on: 2008-10-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Zhu
Full Text: PDF
GTID: 2178360215461903
Subject: Computer application technology
Abstract/Summary:
In recent years, grid technology has been widely used to achieve genuine resource sharing and to schedule and use, in a uniform way, the large amount of resources held at each node. An information grid is built on top of a computing grid, using technologies such as data mining, information fusion, and search engines. It facilitates the searching and sharing of grid resources and is intended to provide a new-generation information platform based on the OS and the web. On this platform, information processing is distributed, cooperative, and intelligent, and information can be accessed through a single entry point.

A focus spider intelligently collects a set of pages that match a given topic. The collected pages are then processed and analyzed using methods such as machine learning and information retrieval, so that users' search requests can be answered efficiently.

The basic concepts and the current state of development of grids and focus spiders, both in China and abroad, are discussed. The significance of research on a focus spider algorithm based on grid technology and the main work of this thesis are also presented.

Grid technology and grid architecture are discussed in detail. The Globus architecture, the OGSA architecture, and the Web Services-based OGSA architecture are described, along with the basic architecture of open grid services.

The concept, architecture, and current problems of focus spiders are analyzed in depth, and a focus spider algorithm, the ZTSpider algorithm, is proposed. Through research on hypertext categorization and hyperlink analysis, the algorithm addresses the lack of online learning in current focus spider algorithms. At the same time, it uses an incremental feedback mechanism and optimizes the passing of information between parent and child nodes, which further improves crawling efficiency.

A ZTSpider simulation system is developed and implemented. The system improves on the architecture of current focus spiders. It is written in the Java programming language and is cross-platform and highly extensible. It performs intelligent collection of web information and validates the effectiveness of the ZTSpider algorithm by measuring the algorithm's gain rate.

A distributed focus spider system based on the information grid is also designed and implemented. The system uses SOAP, WSDL, and UDDI for service description, interface definition, and publishing ZTSpider to the Globus Toolkit. It also realizes the OGSA architecture design and the distributed, cooperative, and intelligent operation of the focus spider.
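
The abstract does not give the details of ZTSpider, so the following is only a minimal Java sketch of the general idea behind a focus spider: a best-first crawl frontier in which each child link inherits a priority from its parent page. It is not the thesis's algorithm. The class and method names, the in-memory toy link graph, the per-page relevance values, and the 0.5 mixing weight are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

/** Illustrative best-first crawl frontier; NOT the thesis's ZTSpider algorithm. */
final class FocusedFrontierSketch {

    /** Hypothetical stand-in for a fetched page: a relevance score plus out-links. */
    static final class Page {
        final double relevance;   // assumed classifier output in [0, 1]
        final List<String> links;
        Page(double relevance, String... links) {
            this.relevance = relevance;
            this.links = Arrays.asList(links);
        }
    }

    /** A URL waiting in the frontier, with the priority inherited from its parent. */
    static final class Candidate {
        final String url;
        final double score;
        Candidate(String url, double score) { this.url = url; this.score = score; }
    }

    /** Assumed weight of the parent's own priority versus the parent page's relevance. */
    static final double PARENT_WEIGHT = 0.5;

    /** Visits at most maxPages pages, always expanding the highest-scored candidate. */
    static List<String> crawl(String seed, Map<String, Page> web, int maxPages) {
        Comparator<Candidate> byScoreDesc = (a, b) -> Double.compare(b.score, a.score);
        PriorityQueue<Candidate> frontier = new PriorityQueue<>(byScoreDesc);
        Set<String> seen = new HashSet<>();
        List<String> visited = new ArrayList<>();

        frontier.add(new Candidate(seed, 1.0));
        seen.add(seed);
        while (!frontier.isEmpty() && visited.size() < maxPages) {
            Candidate next = frontier.poll();
            Page page = web.get(next.url);
            if (page == null) {
                continue; // URL missing from the toy graph: skip it
            }
            visited.add(next.url);
            // Parent-to-child hand-off: children start from a blend of the
            // parent's priority and the parent page's measured relevance.
            double childScore = PARENT_WEIGHT * next.score
                    + (1.0 - PARENT_WEIGHT) * page.relevance;
            for (String link : page.links) {
                if (seen.add(link)) { // enqueue each URL only once
                    frontier.add(new Candidate(link, childScore));
                }
            }
        }
        return visited;
    }

    /** Toy link graph used only for demonstration. */
    static Map<String, Page> demoWeb() {
        Map<String, Page> web = new HashMap<>();
        web.put("seed", new Page(0.9, "on", "off"));
        web.put("on", new Page(0.9, "onChild"));
        web.put("off", new Page(0.0, "offChild"));
        web.put("onChild", new Page(0.8));
        web.put("offChild", new Page(0.1));
        return web;
    }
}
```

On the toy graph, `FocusedFrontierSketch.crawl("seed", FocusedFrontierSketch.demoWeb(), 4)` spends its four-page budget on the seed, both of its children, and the child of the on-topic page; the child of the off-topic page stays in the frontier because it inherited a lower priority.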
Keywords/Search Tags: Grid Services, Information Grid, Focus Spider, Hypertext Categorization, Hyperlink Analysis