The rapid growth of the World Wide Web demands an efficient and effective web crawler that selectively seeks out pages relevant to a desired topic. Such a system is called a focused crawler. Focused crawling often starts from a subset of the web within a local context, whose graph representation is called a context graph. By generating a context graph of a given page set, the learning algorithm in the crawler can be trained on the linkage information captured in the graph. We introduce improved definitions of the context graph and algorithms for generating context graphs in focused crawling that suit the resource limitations of personal computers. We explain why a growing priority is assigned to each node in the context graph to guide further crawling, with two goals: to increase the connectivity level of the graph and to make positive samples in the graph easy to identify. We present several metrics for evaluating the mean and variance of the graph's connectivity, along with corresponding algorithms that adjust the growing priority. We show that this approach improves the quality of the context graph generated under a size limitation. Finally, we show by example how to use this method to control the growth of the context graph and how parameter settings affect the resulting graph.
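As a rough illustration of the connectivity-driven priority idea described above, the Python sketch below computes the mean and variance of node connectivity in a small context graph and adjusts each node's growing priority accordingly. The adjacency-map representation, the function names, and the specific boost/damp update rule are illustrative assumptions, not the paper's actual algorithm.

```python
import statistics

# Minimal sketch, assuming a context graph stored as an adjacency map
# (node -> set of neighbor nodes) and a per-node growing priority.

def connectivity_stats(adj):
    """Return (mean, variance) of node degree over the graph."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    return statistics.mean(degrees), statistics.pvariance(degrees)

def adjust_priorities(adj, priority, boost=1.5, damp=0.8):
    """Assumed update rule: raise the priority of under-connected nodes
    so that further crawling favors links that raise overall connectivity,
    and damp priorities in already dense regions of the graph."""
    mean_degree, _ = connectivity_stats(adj)
    for node, neighbors in adj.items():
        if len(neighbors) < mean_degree:
            priority[node] *= boost   # encourage growth where the graph is sparse
        else:
            priority[node] *= damp    # de-emphasize well-connected regions

# Usage on a tiny four-node context graph.
adj = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
priority = {node: 1.0 for node in adj}
adjust_priorities(adj, priority)
print(connectivity_stats(adj), priority)
```

In this sketch, varying `boost` and `damp` plays the role of the parameter settings mentioned above: larger values shift the crawler's effort more aggressively toward sparse parts of the graph, which changes the shape of the resulting context graph.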