Web Authority Sort Classification Algorithm Based On The Analysis Of Link Credibility

Posted on: 2013-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
GTID: 2248330362465886
Subject: Computer application technology
Abstract/Summary:
With the growing popularity of the Internet, the number of web pages has grown exponentially, and finding information through existing search engines has become difficult. First, search results mix many themes and are not classified by theme, which makes it harder for users to find information on a specific topic. Second, the quality of search results is uneven (they include junk pages, junk advertising, and so on), which makes it hard for users to filter out high-quality information. To address these problems, this thesis makes the following contributions.

First, to resolve the mixing of subjects in the pages returned by a search engine, the thesis labels web pages with category identifiers. Users can then choose a category to search within, which locates the desired information faster and more accurately.

Second, to increase the accuracy of web page text classification, the thesis proposes a feature weighting algorithm based on feature noise weighting. The algorithm reduces the impact of non-standard feature noise on web page text classification, improving the accuracy and robustness of classification.

Third, to address the uneven quality of search results, the thesis introduces the concept of business reputation from the market economy model into the ranking of web page authority. By mining and evaluating the credibility of historical links, it adjusts the page ordering using this evaluation model combined with the PageRank algorithm, which improves the quality of the top search results and encourages web producers to focus on creating high-quality pages.

Finally, the thesis builds a system model based on these ideas to demonstrate their feasibility.
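
The abstract does not give the formula used to combine link credibility with PageRank, so the following is only an illustrative sketch under stated assumptions, not the thesis's method. It assumes each link carries a credibility score in (0, 1] and that a page distributes its rank to its targets in proportion to those scores. The function name `credibility_pagerank`, the `credibility` mapping, and the default of 1.0 for unscored links are all hypothetical.

```python
def credibility_pagerank(links, credibility, damping=0.85, iterations=50):
    """Power-iteration PageRank with per-link credibility weights.

    links: dict mapping a page to the list of pages it links to.
    credibility: dict mapping (source, target) to a weight in (0, 1];
        links without an entry default to 1.0 (an assumption of this sketch).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the teleportation share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links.get(src, [])
            weights = [credibility.get((src, t), 1.0) for t in targets]
            total = sum(weights)
            if total == 0:
                # Dangling page (no usable out-links): spread its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[src] / n
            else:
                # Pass rank to each target in proportion to link credibility.
                for t, w in zip(targets, weights):
                    new_rank[t] += damping * rank[src] * w / total
        rank = new_rank
    return rank


# Hypothetical three-page graph: the link a -> b has low credibility.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
plain = credibility_pagerank(links, {})
weighted = credibility_pagerank(links, {("a", "b"): 0.1, ("a", "c"): 1.0})
```

In this toy graph, page `b` receives rank only through the low-credibility link from `a`, so its score under `weighted` is lower than under `plain`. Normalizing the weights per source page keeps the total rank equal to 1, which is one possible design choice among several.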
Keywords/Search Tags: Text Classification, Link Analysis, Link Reputation, PageRank, Category Search