
Research On Focused Crawler Technology Of Vertical Search Engine

Posted on: 2013-01-19    Degree: Master    Type: Thesis
Country: China    Candidate: L J Liu    Full Text: PDF
GTID: 2248330377459115    Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the Internet, the volume of available information is growing exponentially, and users place increasing demands on information retrieval services, especially regarding the professionalism and accuracy of search results. The retrieval capacity of general search engines cannot meet this demand, and so the vertical search engine came into being. It is a new search service model that serves the needs of a particular profession, group of people, or topic. Compared with general search engines, vertical search engines return results that are more precise, focused, specific, and in-depth.
The focused crawler is a core component of a vertical search engine, and the search strategy it uses to retrieve Web resources directly affects the quality of the vertical search engine, so the focused crawler has become a hot research area in vertical search in recent years. This thesis describes in detail the related concepts, principles, and key technologies of vertical search engines and focused crawlers, and carefully studies the existing classic search strategies, the algorithms for judging topic relevance, and the distribution characteristics of Web pages. On the basis of page distribution, it proposes a comprehensive value that combines topic relevance and page importance to determine whether a page is relevant to the topic, and uses an adaptive immune algorithm to guide the crawling strategy of the focused crawler, which achieved good practical results.
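The abstract does not give the formula for the comprehensive value, so the following is only a minimal sketch of the general idea, under stated assumptions: relevance is taken to be a cosine similarity over raw term counts, importance is assumed to be a score already normalized to [0, 1], and the two are combined by a weighted sum with a hypothetical weight `alpha`. The names `topic_relevance`, `comprehensive_value`, and `Frontier` are illustrative and do not come from the thesis, and the adaptive immune algorithm itself is not reproduced here.

```python
import heapq
import math
from collections import Counter


def topic_relevance(page_text, topic_terms):
    """Cosine similarity between page term counts and a topic term list.

    Assumption: a plain term-count vector model, not the thesis's own measure.
    """
    page = Counter(page_text.lower().split())
    topic = Counter(t.lower() for t in topic_terms)
    dot = sum(page[t] * w for t, w in topic.items())
    norm = math.sqrt(sum(c * c for c in page.values())) * math.sqrt(
        sum(w * w for w in topic.values())
    )
    return dot / norm if norm else 0.0


def comprehensive_value(relevance, importance, alpha=0.7):
    """Weighted sum of relevance and importance, both assumed to lie in [0, 1].

    alpha is a hypothetical tuning weight, not a value from the thesis.
    """
    return alpha * relevance + (1.0 - alpha) * importance


class Frontier:
    """Crawl frontier that returns the highest-scoring unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        # Ignore URLs that were already queued once.
        if url not in self._seen:
            self._seen.add(url)
            # heapq is a min-heap, so store the negated score.
            heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

A crawler built this way would score each extracted link with `comprehensive_value` and push it onto the frontier, so that pages judged both on-topic and important are fetched before the rest.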
At the same time, because search strategies of focused crawlers that rely on a single evaluation value suffer from topic drift, an intelligent crawling algorithm based on the Bloch Quantum Evolutionary Algorithm (QBEA) is proposed. The algorithm takes full account of how Web pages are distributed on the Internet and exploits the advantages of two evaluation criteria, the immediate value and the future value, adjusting online the proportion of each criterion in the integrated value according to the actual progress of the crawl. Simulation experiments show that QBEA obtains higher recall and precision, can solve the existing problems, and is adaptive to a certain degree.
Finally, according to actual application requirements, the proposed focused crawler search strategy was used in a real system together with Oracle SES technology. The results show that the work in this thesis is valid and has some innovative and practical value.
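The idea of blending an immediate value with a future value and re-weighting them during the crawl can be illustrated with a small sketch. This is not the Bloch Quantum Evolutionary Algorithm from the thesis: the update rule below is a deliberately simple harvest-rate feedback rule swapped in for illustration, and the names `integrated_value` and `adjust_weight`, as well as the parameters `step`, `target`, `lo`, and `hi`, are assumptions.

```python
def integrated_value(immediate, future, w):
    """Blend the two criteria; w is the share given to the immediate value."""
    return w * immediate + (1.0 - w) * future


def adjust_weight(w, recent_hits, step=0.05, target=0.5, lo=0.1, hi=0.9):
    """Adjust w online from the recent harvest rate (assumed feedback rule).

    recent_hits is a list of booleans, one per recently fetched page, that
    records whether the page turned out to be on-topic. A high harvest rate
    shifts weight toward the immediate value; a low one shifts weight toward
    the future value so the crawler can pass through off-topic pages.
    """
    if not recent_hits:
        return w
    harvest_rate = sum(recent_hits) / len(recent_hits)
    w = w + step if harvest_rate >= target else w - step
    # Keep both criteria in play by clamping the weight.
    return min(hi, max(lo, w))
```

In use, the crawler would call `adjust_weight` after each batch of fetched pages and then rank the pending links with `integrated_value` using the updated weight.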
Keywords/Search Tags: Vertical Search Engine, Focusing Crawler, Relevance Computation, Heuristic Search, Bloch Quantum Evolutionary Algorithm