The rapid growth of the World Wide Web demands an efficient and effective web crawler that selectively seeks out pages relevant to a desired topic. Such a system is called a focused crawler. Focused crawling often starts from a subset of the web within a local context, whose graph representation is called a context graph. By generating a context graph of a given page set, the learning algorithm in the crawler can be trained on the linkage information captured in the graph. We introduce improved definitions of the context graph and algorithms for generating context graphs in focused crawling that suit the resource limitations of personal computers. We explain why a growing priority is assigned to each node in the context graph to guide further crawling, with two goals: to increase the connectivity level of the graph and to make positive samples in the graph easy to identify. We present several metrics for evaluating the mean and variance of the graph's connectivity, along with corresponding algorithms that adjust the growing priority. We show that this approach improves the quality of the context graph generated under a size limitation. Finally, we show by example how to use this method to control the growth of the context graph and how parameter settings affect the resulting graph.
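As a rough illustration of the connectivity-driven priority idea described above, the Python sketch below computes the mean and variance of node connectivity in a small context graph and adjusts each node's growing priority accordingly. The adjacency-map representation, the function names, and the specific boost/damp update rule are illustrative assumptions, not the paper's actual algorithm.

```python
import statistics

# Minimal sketch, assuming a context graph stored as an adjacency map
# (node -> set of neighbor nodes) and a per-node growing priority.

def connectivity_stats(adj):
    """Return (mean, variance) of node degree over the graph."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    return statistics.mean(degrees), statistics.pvariance(degrees)

def adjust_priorities(adj, priority, boost=1.5, damp=0.8):
    """Assumed update rule: raise the priority of under-connected nodes
    so that further crawling favors links that raise overall connectivity,
    and damp priorities in already dense regions of the graph."""
    mean_degree, _ = connectivity_stats(adj)
    for node, neighbors in adj.items():
        if len(neighbors) < mean_degree:
            priority[node] *= boost   # encourage growth where the graph is sparse
        else:
            priority[node] *= damp    # de-emphasize well-connected regions

# Usage on a tiny four-node context graph.
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
priority = {node: 1.0 for node in adj}
adjust_priorities(adj, priority)
print(connectivity_stats(adj), priority)
```

In this sketch, varying `boost` and `damp` plays the role of the parameter settings mentioned above: larger values shift the crawler's effort more aggressively toward sparse parts of the graph, which changes the shape of the resulting context graph.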