The Research Of Multi-Strategies Methods In The Information Crawling

Posted on: 2009-11-23
Degree: Master
Type: Thesis
Country: China
Candidate: Q Ying
Full Text: PDF
GTID: 2178360245469991
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Search engines are widely used in everyday life. As the Internet has grown in recent years, information crawling, the core component of a search engine, has attracted increasing attention. Because crawling efficiency is closely tied to the structure of the network being crawled, understanding that structure is the key to improving crawling efficiency. In this thesis, we review recent work on information crawling and on network simulation models, then focus on crawling strategies for scale-free networks with large exponents. The contributions of this thesis are as follows: (1) Based on several network characteristics, we carry out simulations and experiments and use the results to analyze network structure. We conclude that both the BA network model and the GGSS network model are scale-free. (2) We apply three information crawling strategies to the three network models and run simulation tests of each strategy on each model. The results differ across strategies and models, but the most important conclusion is that the High-degree Seeking crawling strategy performs best on scale-free networks.
Keywords/Search Tags: information crawling, scale-free network, BA network model, GGSS network model, High-degree Seeking crawling strategy
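
The kind of simulation the abstract describes can be illustrated with a minimal sketch. It is not the thesis's code: it assumes the BA model is built by standard preferential attachment, and it assumes "High-degree Seeking" means always visiting the highest-degree node among those discovered but not yet visited, with degrees known in advance. The names `ba_graph` and `crawl`, the breadth-first baseline, and the coverage measure (nodes discovered within a visit budget) are all hypothetical choices made for illustration. The GGSS model is not shown.

```python
import random
from collections import deque


def ba_graph(n, m, seed=0):
    """Build an undirected BA-style graph by preferential attachment.

    Starts from a complete core of m + 1 nodes; each later node links to
    m distinct existing nodes chosen with probability proportional to degree.
    """
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    repeated = []  # each node appears once per unit of degree
    for i in range(m + 1):
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            repeated += [i, j]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(repeated))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            repeated += [new, t]
    return adj


def crawl(adj, start, budget, high_degree):
    """Visit at most `budget` nodes and return how many nodes were discovered.

    high_degree=True  : visit the highest-degree frontier node next
                        (assumed reading of High-degree Seeking).
    high_degree=False : visit frontier nodes in FIFO order (breadth-first).
    """
    discovered = {start}
    frontier = deque([start])
    visited = 0
    while frontier and visited < budget:
        if high_degree:
            node = max(frontier, key=lambda v: len(adj[v]))
            frontier.remove(node)
        else:
            node = frontier.popleft()
        visited += 1
        for nb in adj[node]:
            if nb not in discovered:
                discovered.add(nb)
                frontier.append(nb)
    return len(discovered)


if __name__ == "__main__":
    g = ba_graph(2000, 2, seed=1)
    for budget in (10, 50, 200):
        hd = crawl(g, 0, budget, high_degree=True)
        bfs = crawl(g, 0, budget, high_degree=False)
        print(f"budget={budget}: high-degree={hd}, breadth-first={bfs}")
```

Running the script prints, for each visit budget, how many nodes each strategy discovered. On a single random graph the comparison is only indicative; averaging over many seeds and start nodes would be needed before drawing conclusions like those in the abstract.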