
Research On Co-Ranking Algorithm In Web Pages Based On Random Walk

Posted on: 2011-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: S J Zhang
Full Text: PDF
GTID: 2178330332461461
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the World Wide Web, the volume of information has grown explosively, and quick access to the desired information is becoming increasingly difficult. Web page ranking algorithms are therefore becoming increasingly important. The search engine is by far the most effective way of helping people find the information they are looking for. People use search engines not only to find information but also expect accurate results for queries on related topics. How to efficiently obtain the information users require has therefore become a major challenge for search engines.

In recent years, link analysis algorithms based on random walks have achieved great success, but they are limited in that they consider only a single kind of link structure or a single kind of text information. The HITS algorithm is topic-based but has weak resistance to cheating. PageRank resists cheating effectively but is not topic-sensitive: it cannot distinguish whether a Web page is authoritative in a broad sense or only for a specific query. The accuracy of current Web page ranking still cannot meet the needs of Internet users. Links, fonts, position, and other surface features alone cannot truly determine how relevant an article is to the search terms.

This thesis proposes a novel collaborative ranking framework (Co-Ranking) for Web pages. The algorithm is based on a random walk over two different page weights, which better simulates the user's Web browsing behavior. For example, an entity ranked near the top by the HITS algorithm should also receive a good ranking under the PageRank algorithm. Likewise, for the same page, its position in the PageRank ordering is tied to its position in the HITS ordering. As in the PageRank algorithm, the user selects the next page to visit from the current page, and with a very small probability also jumps from the current page to a page with a high HITS Authority value. The algorithm thus takes into account both global importance and local, topic-specific relevance. Experiments show that the algorithm improves accuracy by 7% compared with four other page ranking algorithms.
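The random walk described above can be illustrated with a minimal Python sketch. It is not the thesis's actual formulation, which the abstract does not give. It assumes that the surfer follows a uniformly chosen out-link with probability 1 - jump_prob and otherwise jumps to a page chosen in proportion to its HITS Authority score. The function names, the dictionary graph representation, the default jump_prob of 0.15, and the handling of pages without out-links are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical graph format: page -> list of pages it links to.
# Every link target is assumed to appear as a key as well.
Graph = Dict[str, List[str]]


def hits_authority(graph: Graph, iterations: int = 50) -> Dict[str, float]:
    """Authority scores from the standard HITS iteration, scaled to sum to 1."""
    nodes = list(graph)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {n: 0.0 for n in nodes}
        for src, targets in graph.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        total = sum(new_auth.values()) or 1.0
        auth = {n: v / total for n, v in new_auth.items()}
        # Hub of a page: sum of authority scores of the pages it links to.
        new_hub = {n: sum(auth[dst] for dst in graph[n]) for n in nodes}
        total = sum(new_hub.values()) or 1.0
        hub = {n: v / total for n, v in new_hub.items()}
    return auth


def co_rank(graph: Graph, jump_prob: float = 0.15,
            iterations: int = 100) -> Dict[str, float]:
    """PageRank-style walk whose random jump favors high-authority pages."""
    nodes = list(graph)
    auth = hits_authority(graph)
    if sum(auth.values()) == 0.0:
        # No links at all: fall back to a uniform jump distribution.
        auth = {n: 1.0 / len(nodes) for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {n: 0.0 for n in nodes}
        jump_mass = 0.0
        for src in nodes:
            targets = graph[src]
            if targets:
                # Follow one of the out-links, chosen uniformly.
                share = (1.0 - jump_prob) * rank[src] / len(targets)
                for dst in targets:
                    new_rank[dst] += share
                jump_mass += jump_prob * rank[src]
            else:
                # Assumption: a page without out-links always jumps.
                jump_mass += rank[src]
        # Distribute the jump mass in proportion to HITS authority.
        for n in nodes:
            new_rank[n] += jump_mass * auth[n]
        rank = new_rank
    return rank
```

Calling co_rank({"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}) returns one score per page, and the scores sum to 1. Because the jump targets follow the authority scores rather than a uniform distribution, a page that has neither in-links nor authority (page "d" here) receives no rank, which is one way the walk can combine global link structure with a second, topic-oriented weight.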
Keywords/Search Tags: Information Retrieval, Co-Ranking, Random Walk