Design and Implementation of a Web Crawler System Supporting AJAX

Posted on: 2010-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: W H Ceng
GTID: 2178360302959660
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Jesse James Garrett introduced the concept of AJAX (Asynchronous JavaScript and XML) in 2005. AJAX improved web site development and made pages more interactive for users, without requiring the client to install any plug-ins. As a result, it has attracted widespread attention across the internet industry. However, current web crawlers cannot determine the order in which scripts are triggered when resolving URLs in AJAX-based pages, so a large amount of data cannot be effectively indexed by search engines.

To address this issue, this thesis studies an object-based program slicing algorithm. After analyzing the code of a web page, we compute the slice and execute it with a script execution engine to rebuild the page's DOM tree. The crawler then follows the links found in the DOM tree as it stands after script execution. This resolves the key technical problem of extracting URLs from asynchronous JavaScript in AJAX-based pages so that a web crawler can fetch them.

On this basis we designed and implemented a crawler system for AJAX web sites. In terms of theory and technique, we summarize and propose methods for extracting URL information from AJAX-based web sites, for executing code segments in the correct order, and for interoperation among the program slicing module, the crawler module, and the script execution module, providing a new solution for crawling AJAX-based web sites.
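The final step described above, collecting links from the DOM tree as it stands after script execution, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the slicing and script execution stages are omitted, and the rebuilt DOM is supplied by hand as a serialized HTML string. The names `LinkCollector`, `extract_links`, `BASE_URL`, `STATIC_HTML`, and `REBUILT_DOM`, as well as the example URLs, are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags in a serialized DOM."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.hrefs.append(value)


def extract_links(serialized_dom, base_url):
    """Return the set of absolute http(s) URLs linked from the DOM."""
    collector = LinkCollector()
    collector.feed(serialized_dom)
    links = set()
    for href in collector.hrefs:
        # Resolve relative links and drop the fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        # Skip non-fetchable schemes such as javascript: pseudo-links.
        if absolute.startswith(("http://", "https://")):
            links.add(absolute)
    return links


# Hypothetical page: the static markup has one link. The DOM rebuilt after
# script execution (written by hand here, an assumption) has more.
BASE_URL = "http://example.com/news/"
STATIC_HTML = '<div id="list"><a href="/about">About</a></div>'
REBUILT_DOM = (
    '<div id="list"><a href="/about">About</a>'
    '<a href="item?id=1">Item 1</a>'
    '<a href="javascript:void(0)">More</a>'
    '<a href="item?id=2#top">Item 2</a></div>'
)

static_links = extract_links(STATIC_HTML, BASE_URL)
rebuilt_links = extract_links(REBUILT_DOM, BASE_URL)

# Links that only a crawler working on the post-script DOM would discover.
ajax_only_links = sorted(rebuilt_links - static_links)
print(ajax_only_links)
```

The difference between the two link sets is what a crawler that only parses static markup would miss; in a full system, the newly found URLs would be added to the crawl queue.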
Keywords/Search Tags: AJAX, Web Crawler, Asynchronous Interaction, Script Analysis