A Web Crawler Supporting AJAX

Posted on:2008-12-04

Degree:Master

Type:Thesis

Country:China

Candidate:B Luo

GTID:2178360212984907

Subject:Computer applications

Abstract/Summary:

A web crawler is an important component of a search engine. Web developers use AJAX (Asynchronous JavaScript and XML) technologies to build applications that are easier to use and more functional than traditional web programs. AJAX changes the content of a web page dynamically: the page sends requests asynchronously and updates itself with the data returned by the web server. As a result, a traditional web crawler collects less data than is presented in the web browser. We propose a new web crawler, AjaxCrawler, that supports AJAX. AjaxCrawler consists of five parts: crawling the web page, analyzing the web page, interpreting JavaScript, invoking DOM operation methods, and regenerating the web page. First, it crawls the web page with an HTTP request. Second, it analyzes the page elements, including not only the links but also the JavaScript code and files in the page. Then it executes the JavaScript code, which includes the AJAX requests, gets the results from the server, and invokes DOM operation methods to change the content of the web page. Finally, it regenerates the web page and extracts the links. In our experiments, AjaxCrawler crawled more content than a traditional crawler under the same conditions.
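
The page-analysis and link-extraction steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the class name PageAnalyzer, the function analyze, the example URLs, and the two sample pages are all assumptions made for illustration. JavaScript interpretation is not attempted here; the "regenerated" page is written by hand to stand in for whatever the JavaScript and DOM stages would produce.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageAnalyzer(HTMLParser):
    """Hypothetical analyzer: collects links, script files, and inline scripts."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []            # absolute URLs found in <a href=...>
        self.script_files = []     # absolute URLs found in <script src=...>
        self.inline_scripts = []   # source text of <script> elements without src
        self._in_inline_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "script":
            if attrs.get("src"):
                self.script_files.append(urljoin(self.base_url, attrs["src"]))
            else:
                self._in_inline_script = True
                self.inline_scripts.append("")

    def handle_data(self, data):
        # Text inside an inline <script> is delivered here as raw data.
        if self._in_inline_script:
            self.inline_scripts[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_inline_script = False


def analyze(html, base_url):
    """Parse one page and return the populated analyzer."""
    parser = PageAnalyzer(base_url)
    parser.feed(html)
    parser.close()
    return parser


# Page as a traditional crawler would receive it over HTTP (made-up example).
STATIC_PAGE = """<html><body>
<a href="/about">About</a>
<script src="/js/app.js"></script>
<script>loadNews();</script>
<div id="news"></div>
</body></html>"""

# Hand-written stand-in for the page after the AJAX call has filled the <div>.
REGENERATED_PAGE = STATIC_PAGE.replace(
    '<div id="news"></div>',
    '<div id="news"><a href="/news/1">News 1</a></div>',
)

static_result = analyze(STATIC_PAGE, "http://example.com/")
regenerated_result = analyze(REGENERATED_PAGE, "http://example.com/")
```

In this made-up example the static page yields one link, while the regenerated page yields two, which mirrors the abstract's point that a crawler working only on the HTTP response sees less than the browser shows. The script_files and inline_scripts lists are what a JavaScript interpreter stage would need as input.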

Keywords/Search Tags:

Search Engine, Web Crawler, AJAX, Web2.0

Related items

1	A Study Of Ajax-oriented Search Engine Techniques
2	Ajax Data Search Engine Clawer Research And Design
3	Research And Implement Of Distributed Crawler System Supporting AJAX
4	Research On Blog Search Engine Based On RSS And Realized By LUCENE
5	The Design And Implementation Of WEB Crawler And Topic Search Engine Based On Nutch
6	Design And Implement Of Information Document Search Engine System Based On JavaEE Platform And Lucene
7	Design And Implementation Of Vertical-Search-Engine-Oriented Spider
8	Customizable Focused Crawler
9	Research And Implementation Of LMS System Based On Ajax
10	Research On The Crawler Of Search Engine