
Research On Collaboration And Implementation Of Focused Crawler Based On Multi-Agent System

Posted on: 2013-11-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 2248330377453764
Subject: Computer software and theory
Abstract/Summary:
As the Internet grows exponentially, general search engines are encountering unprecedented challenges. The results returned by a general search engine contain a large amount of information irrelevant to the user’s query, which gave rise to the focused crawler. A focused crawler crawls only on-topic web pages and avoids a large number of off-topic pages, so it saves much time on web crawling. Its advantages are that, on the one hand, it spends less time and storage space crawling the web and, on the other hand, it can better meet the personalized needs of the user. This has also promoted the development of focused crawlers.

Traditional focused crawlers work independently during crawling, with no communication or collaboration among them. Because they cannot communicate in time to share information, their crawls overlap and crawling efficiency is low. Using Multi-Agent theory to achieve communication and collaboration among focused crawlers is a new topic for improving their crawling precision and efficiency. This paper treats each focused crawler as an agent that is independent, flexible, interactive and so on, and uses Multi-Agent knowledge to implement collaboration among multiple focused crawlers during web crawling, so as to improve their crawling precision and efficiency.

The main contents of this paper include the following:

1. This paper proposes a novel ability measurement method, which is used to evaluate whether an agent has the ability to call for proposals. It considers not only the importance of the web pages in the crawling history but also the scores of web links. The comparative experiment shows that this method evaluates the ability to call for proposals better.

2. This paper proposes a new Multi-Agent organizational structure for focused crawlers. In this structure, all agents are divided into three categories, namely F-Agent (Facilitator-Agent), As-Agent (Assistance-Agent) and C-Agent (Crawler-Agent). Each works within its own responsibilities, and they cooperate to complete the common task of web crawling.

3. This paper introduces a collaboration model of focused crawlers based on a Multi-Agent System. An improved contract net protocol is used to achieve collaboration among the focused crawlers, and the paper discusses in detail the four processes of the improved protocol: task announcement/call for proposals (cfp), bidding, contracting and termination. To realize the collaboration model, the paper proposes its system framework and workflow.

Finally, we implemented the collaboration model of focused crawlers based on Multi-Agent with JADE, and used crawling precision and crawling efficiency to compare the performance of this system with four other kinds of web crawlers. The comparative experiment results show that this system not only reduces crawling overlap but also achieves higher crawling precision and crawling efficiency.
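
The four contract net processes named above (task announcement/cfp, bidding, contracting, termination) can be illustrated with a minimal sketch. It is plain Python rather than JADE and is not the thesis's implementation. The class `CrawlerAgent`, the weighted-average `ability` formula, the weight `alpha`, the threshold `min_ability`, and the term-overlap bid are all hypothetical choices made for illustration; the abstract states only that the ability measure combines crawl-history page importance with web link scores.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class CrawlerAgent:
    name: str
    topic_terms: set        # hypothetical: terms describing this agent's topic
    page_importance: list   # hypothetical: importance of pages crawled so far
    link_scores: list       # hypothetical: scores of links found so far
    assigned: list = field(default_factory=list)

    def ability(self, alpha=0.5):
        """Assumed weighted mix of crawl-history importance and link scores."""
        history = mean(self.page_importance) if self.page_importance else 0.0
        links = mean(self.link_scores) if self.link_scores else 0.0
        return alpha * history + (1 - alpha) * links

    def bid(self, task_terms):
        """Assumed bid: fraction of the task's terms matching this agent's topic."""
        if not task_terms:
            return 0.0
        return len(self.topic_terms & task_terms) / len(task_terms)


def contract_net_round(manager, others, url, task_terms, min_ability=0.5):
    """Run one cfp -> bidding -> contracting -> termination round."""
    # 1. Task announcement: only an agent with enough ability may issue a cfp.
    if manager.ability() < min_ability:
        return None
    # 2. Bidding: every other agent answers the cfp; a zero bid counts as a refusal.
    bids = {agent.name: agent.bid(task_terms) for agent in others}
    bids = {name: value for name, value in bids.items() if value > 0}
    if not bids:
        return None
    # 3. Contracting: award the URL to the highest bidder.
    winner_name = max(bids, key=bids.get)
    winner = next(agent for agent in others if agent.name == winner_name)
    winner.assigned.append(url)
    # 4. Termination: close the contract and report the outcome.
    return {"manager": manager.name, "winner": winner_name, "bids": bids}


manager = CrawlerAgent("c1", {"sports"}, [0.8, 0.9], [0.7, 0.6])
helpers = [
    CrawlerAgent("c2", {"jazz", "music"}, [0.5], [0.4]),
    CrawlerAgent("c3", {"finance"}, [0.6], [0.5]),
]
result = contract_net_round(
    manager, helpers, "http://example.com/jazz", {"jazz", "music", "concert"}
)
```

In this toy run the off-topic URL found by `c1` is handed to `c2`, the only agent whose topic overlaps the task terms, so no two agents fetch the same page. A manager whose ability falls below `min_ability` issues no cfp at all.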
Keywords/Search Tags: Multi-Agent, Focused Crawler, Contract Net Protocol, Collaboration, JADE