Study And Application Of Dynamic Refresh Strategy For Network Information

Posted on: 2012-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: H Yang
Full Text: PDF
GTID: 2178330332494869
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet, the number of web pages has grown sharply, which poses a great challenge to acquisition hardware, and intensive crawling by search engine Web crawlers hinders ordinary users' access through the browser. How can network information be used effectively? The acquisition strategy is crucial. It is essential to adjust the acquisition frequency according to how frequently pages are refreshed. Doing so makes acquisition more targeted and also reduces the load placed on the acquisition hardware when acquisition is concentrated at a single moment.

This paper introduces the fundamental principles of Web crawler information acquisition and the characteristics of incremental Web crawling technology. Changes in web pages were investigated. An adaptive algorithm for the acquisition period is presented for targeted industry data acquisition. To address the subjectivity in setting the threshold for the refresh frequency, this paper proposes using the quartile: the threshold is obtained automatically from the quartile of the latest N records.

This paper therefore employs the following strategy. Based on incremental information acquisition technology, the change in a web page within a collection period is obtained and compared with the threshold derived from the quartile. When the refresh frequency of a web page exceeds the threshold, the acquisition period is adjusted to realize dynamic acquisition of network information. Furthermore, for comparison, this paper presents the dynamic acquisition algorithm proposed by a fellow member of the research group.

Experimental results show that the presented dynamic acquisition algorithm is feasible and has a certain reference value, with a different focus from the fellow member's algorithm. This study is conducive to the effective utilization of information resources on the World Wide Web and reduces the demands on acquisition hardware.
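
The quartile-based strategy summarized in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's algorithm: the abstract does not say which quartile is used or how the period is adjusted, so the choice of the upper quartile (Q3), the halving rule, and the names `quartile_threshold`, `adjust_period`, `n_records`, and `min_period` are all hypothetical.

```python
from statistics import quantiles


def quartile_threshold(history, which=3):
    """Return the chosen quartile (1, 2 or 3) of past change counts.

    Assumption: the upper quartile (Q3) is used; the abstract only
    says "the quartile".
    """
    cuts = quantiles(history, n=4, method="inclusive")  # [Q1, Q2, Q3]
    return cuts[which - 1]


def adjust_period(period, changes, history, n_records=8, min_period=1.0):
    """Return (new_period, threshold) for one collection period.

    period    -- current acquisition period (e.g. in hours)
    changes   -- number of page changes observed in this period
    history   -- change counts from earlier periods, oldest first
    n_records -- how many of the latest records feed the threshold (N)
    """
    recent = list(history)[-n_records:]
    if len(recent) < 2:
        # Too few records to compute a quartile: keep the period as is.
        return period, None
    threshold = quartile_threshold(recent)
    if changes > threshold:
        # Assumed rule: pages changing faster than the threshold are
        # revisited twice as often, down to a minimum period.
        period = max(min_period, period / 2)
    return period, threshold


if __name__ == "__main__":
    past = [1, 2, 3, 4, 5, 6, 7, 8]
    print(adjust_period(24.0, 7, past))
    print(adjust_period(24.0, 5, past))
```

Because the threshold is recomputed from the latest N records each time, it follows the observed change pattern of the pages instead of relying on a hand-picked constant, which is the subjectivity the abstract sets out to avoid.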
Keywords/Search Tags: Information acquisition, Dynamic acquisition, Webpage evolution, Quartile