
Research And Implementation Of A Web Tracing System

Posted on: 2004-09-03
Degree: Master
Type: Thesis
Country: China
Candidate: F Liu
Full Text: PDF
GTID: 2168360152467692
Subject: Computer Science and Technology
Abstract/Summary:
Changing content is a defining property of the current Internet. Because content on the web changes quickly, tracing updates to specified pages effectively is an important problem, as it is for other web applications. A tracing system is an information service that focuses on the dynamic changes of web pages. In this thesis, we study the key technologies of a web tracing system. The main contributions are:

1. The Poisson process is a good mathematical model of web page changes. Based on this model, we predict the change frequency of a specific web page and adjust the tracing frequency to match it, which can greatly improve tracing efficiency. We first study the Poisson process model of web page changes in detail and balance the catching ratio against the recall ratio. We then design the ASA algorithm to obtain the best performance, together with an improved QFA method for comparison. An experimental platform is built to test these algorithms.

2. The web crawler, which collects the source web pages, is an important part of a tracing system. We improve the crawler's performance in two ways to make it suitable for web tracing: we identify the best way to check the status of a web page, and we use multithreading to improve the efficiency of page downloading.

3. The efficiency of web change detection is improved. We simplify the difference-detection function and focus on changes in the links between versions of a specific web page. A distributed storage system is designed to improve file access. Delivery of update information is also important, and we propose a prototype for it.

We designed our own web tracing system, ChangeSpider. ChangeSpider automatically monitors changes to user-specified web pages and adjusts its tracing frequency according to the change frequency of the target pages. Experiments show that ChangeSpider is useful and performs well.
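
The idea of adjusting the tracing frequency under a Poisson change model can be illustrated with a minimal Python sketch. This is not the ASA or QFA algorithm from the thesis, whose details the abstract does not give. It assumes that a page is polled at a fixed interval, that each poll reports only whether the page changed since the previous poll, and that the next interval is chosen so that the probability of at least one change reaches a target value. The function names, the 0.5 smoothing terms, and the default bounds are all hypothetical.

```python
import math


def estimate_change_rate(visits, changes_detected, interval_hours):
    """Estimate a Poisson change rate (changes per hour) from regular polls.

    Under a Poisson model, P(no change in one interval) = exp(-rate * interval),
    so the rate follows from the fraction of polls that saw no change.
    The +0.5 terms are a smoothing assumption that keeps the logarithm
    finite when every poll detected a change.
    """
    if visits <= 0 or interval_hours <= 0:
        raise ValueError("visits and interval_hours must be positive")
    if not 0 <= changes_detected <= visits:
        raise ValueError("changes_detected must be between 0 and visits")
    unchanged_fraction = (visits - changes_detected + 0.5) / (visits + 0.5)
    return -math.log(unchanged_fraction) / interval_hours


def next_poll_interval(rate, target_change_prob=0.5,
                       min_hours=1.0, max_hours=168.0):
    """Pick the interval at which P(at least one change) equals the target.

    Solves 1 - exp(-rate * t) = target_change_prob for t, then clamps the
    result to [min_hours, max_hours] (hypothetical bounds).
    """
    if not 0.0 < target_change_prob < 1.0:
        raise ValueError("target_change_prob must be strictly between 0 and 1")
    if rate <= 0:
        # No change has been observed yet: poll as rarely as allowed.
        return max_hours
    interval = -math.log(1.0 - target_change_prob) / rate
    return min(max(interval, min_hours), max_hours)


if __name__ == "__main__":
    # Hypothetical history: 10 daily polls, 5 of which detected a change.
    rate = estimate_change_rate(visits=10, changes_detected=5, interval_hours=24)
    print(f"estimated rate: {rate:.4f} changes/hour")
    print(f"next poll in:   {next_poll_interval(rate):.1f} hours")
```

Pages that change often receive a short interval and static pages drift toward the maximum interval, which is the self-adjusting behavior the abstract describes for ChangeSpider.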
Keywords/Search Tags: Information Changing, Update Frequency, Information Tracing, Information Delivery