Optimization and Implementation of a Focused Crawling Algorithm Based on the Website Content Framework

Posted on: 2013-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: C X Dong
Full Text: PDF
GTID: 2218330371478376
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, the volume of online information is growing quickly, and search engines have become an indispensable way for people to obtain information in daily life. In a modern search engine the crawler plays a central role: it is the core component, and only after the crawler has fetched web pages in sufficient quantity and quality can the search engine build an index and offer keyword-based search to users. However, with the explosive growth of the web, the number of websites sharing the same topic keeps increasing. How to fetch pages quickly, analyze page information more accurately, and integrate the results effectively has become the core problem of a web crawler system and one of the main problems facing search engines. This thesis starts from the topic of a website and analyzes the site's operating framework, extracting a framework that reflects the nature of the site's topic. Based on the behavior of that framework, it determines how the current page should be handled by analyzing it against a knowledge model. Following software engineering practice, the thesis designs and implements a focused (topic) crawler algorithm based on this topic framework and then optimizes it. Finally, the experimental results are evaluated with standard measures, which demonstrate the validity of the optimization.
Keywords/Search Tags: link analysis, focused crawler, page model, hypertext classification, information retrieval, vertical search engine