Research On Extraction And Analysis Of Information Networks And Rank Related Problems In INTERNET

Posted on: 2008-06-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H B Liu
Full Text: PDF
GTID: 1118360242994060
Subject: Computer Science and Technology
Abstract/Summary:
The Internet removes the bottleneck caused by the limited range and means of communication and resource sharing. Under these circumstances, the gap arising from differences in comprehension and in requirements for information resources becomes the essential obstacle to communication, and it grows increasingly significant.

Research on information networks focuses on the relationships among different kinds of information units. It can uncover the intrinsic principles and relationships on the Internet, take advantage of the data-processing power of computers, and provide valuable information to users. Mining services such as search engines provide an efficient way to locate resources on the Internet, and Rank technology strongly affects the quality of mining results. Power-law features have been discovered in the degree distribution of the Internet. The heterogeneity of network structure can provide valuable intrinsic information for Rank related problems, and its effectiveness has been demonstrated by applications such as search engines.

This thesis presents our research on the extraction and analysis of information networks and on Rank related problems. We focus mainly on the WWW publishing system and the Usenet discussion system. For the WWW system, we wrote a Spider program, used it to collect a dataset within the campus network, and extracted the hyperlink network structure at the page level and the site level. We studied the distribution and connection features of the hyperlink network. To understand the emergence of the power-law distribution, we built a network model with preferential attachment and intrinsic fitness and report its simulation results. We also propose ExpRank, a fast rank algorithm based on web structure.

For the Usenet system, we wrote a Bot program and used it to collect a dataset from an NNTP server. From the dataset, we extracted the relationship networks among participants, threads, and respondents. Based on the characteristics of the Usenet discussion system, we propose a method to evaluate the IFP and IFT of participants and threads using the relationship network, and an algorithm called PostRank that calculates the rank of postings from the respondent networks. We discuss the meaning and features of our algorithms and give experimental results on the collected datasets. Our work can provide valuable and practical information for understanding information networks and for applying Rank to them.
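
The abstract does not specify the network model, so the following is only a minimal sketch of one common way to combine preferential attachment with intrinsic fitness: each new node links to existing nodes with probability proportional to fitness times degree. The function name `grow_network`, the uniform fitness distribution, and the parameters `n_nodes` and `m_links` are assumptions made for illustration, not details taken from the thesis.

```python
import random
from collections import Counter


def grow_network(n_nodes, m_links=2, seed=0):
    """Grow an undirected network and return (degree, fitness) lists.

    Assumed rule (not from the thesis): a new node attaches to an
    existing node i with probability proportional to fitness[i] * degree[i].
    """
    rng = random.Random(seed)

    # Start from a small fully connected core of m_links + 1 nodes.
    core = m_links + 1
    degree = [core - 1] * core
    # Fitness is drawn uniformly from [0.05, 1.0) so every weight is positive.
    fitness = [0.05 + 0.95 * rng.random() for _ in range(core)]

    for new in range(core, n_nodes):
        weights = [f * k for f, k in zip(fitness, degree)]
        # Pick m_links distinct existing nodes, biased by fitness * degree.
        targets = set()
        while len(targets) < m_links:
            targets.add(rng.choices(range(new), weights=weights)[0])
        for t in targets:
            degree[t] += 1
        degree.append(m_links)
        fitness.append(0.05 + 0.95 * rng.random())

    return degree, fitness


if __name__ == "__main__":
    degree, _ = grow_network(2000, m_links=2, seed=1)
    histogram = Counter(degree)
    # Print how many nodes have each degree, smallest degrees first.
    for k in sorted(histogram)[:10]:
        print(k, histogram[k])
```

Plotting the degree histogram on log-log axes is a quick way to check whether the simulated network has a heavy-tailed degree distribution. The quadratic running time of this sketch is acceptable for a few thousand nodes but would need a faster sampling structure for larger simulations.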
Keywords/Search Tags: Information Networks, WWW System, Newsgroup, Discussion System, Power Law, Rank Problem, Link Analysis, Scale Free Networks, Search Engine, Impact Factor