Search Engine Sorting Algorithms Based On The Relation Degree Of The Word

Posted on: 2013-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: S X Guan
GTID: 2248330371987459
Subject: Computer application technology
Abstract/Summary:
The main task of a search engine is to collect information from the network and return relevant pages to the user. As the web grows in volume and information, crawling enough pages has become easy for a search engine, but presenting them to users in an appropriate order remains difficult. At present, search engine ranking algorithms are mainly based on link structure, such as the PageRank and HITS algorithms, along with improved variants that combine them with other algorithms; practice shows that these improved algorithms work well. However, link-based ranking algorithms have shortcomings of their own: for example, their capacity for natural language analysis is insufficient, so they partly lack an understanding of the language. This thesis therefore proposes a ranking algorithm based on word relations. First, we obtain the words related to the keywords and record their relation degrees by analyzing co-occurrence rates and accumulating the spacing between words and their gains over the document set. Second, after receiving the keywords from the user, we feed the relation degrees into the PageRank algorithm to influence the ranking of the results. We obtain the test documents from Google, re-rank them with the above algorithm, and compare the result with Google's ranking. The experimental results lead us to conclude that the proposed algorithm can alleviate the existing problems of link-based ranking algorithms. It also has some disadvantages: first, the corpus is of a single kind and the experimental range is too small; second, the efficiency of the search algorithm is not fully considered. The proposed algorithm can be improved further through analysis of more experiments.
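The abstract does not give the formulas behind the two steps, so the following is only a minimal sketch of how they might fit together, not the thesis's actual method. It assumes that a word's relation degree to a keyword grows by 1/distance each time the two appear within a small window, and that the resulting per-document weights bias the teleport vector of PageRank. The function names, the `window` parameter, and the 1/distance weighting are all hypothetical.

```python
from collections import defaultdict


def relation_degrees(docs, keyword, window=3):
    """Score words by how often and how closely they co-occur with `keyword`.

    Assumption (not from the thesis): each co-occurrence within `window`
    tokens contributes 1 / distance, so nearer words gain more.
    Scores are normalized so the strongest related word has degree 1.0.
    """
    scores = defaultdict(float)
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok != keyword:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] != keyword:
                    scores[tokens[j]] += 1.0 / abs(i - j)
    top = max(scores.values(), default=0.0)
    return {w: s / top for w, s in scores.items()} if top else {}


def document_weight(tokens, keyword, degrees):
    """Sum the relation degrees of a document's tokens; the keyword counts 1.0."""
    return sum(1.0 if t == keyword else degrees.get(t, 0.0) for t in tokens)


def biased_pagerank(links, weights, damping=0.85, iterations=50):
    """PageRank whose teleport vector is proportional to `weights`.

    `links` maps each page to the list of pages it links to; every link
    target must itself be a key of `links`. Rank held by pages without
    outgoing links is redistributed according to the teleport vector.
    """
    pages = list(links)
    total = sum(weights[p] for p in pages)
    teleport = {p: weights[p] / total if total else 1.0 / len(pages)
                for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        dangling = sum(rank[p] for p in pages if not links[p])
        new = {p: (1 - damping + damping * dangling) * teleport[p]
               for p in pages}
        for p in pages:
            for q in links[p]:
                new[q] += damping * rank[p] / len(links[p])
        rank = new
    return rank
```

Biasing the teleport vector is one common way to inject a relevance signal into PageRank; the thesis may combine the two quantities differently. With this choice, a page whose text contains neither the keyword nor any related word receives no teleport mass and can only gain rank through incoming links.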
Keywords/Search Tags: search engine, sorting algorithm, PageRank, the relation degree of the word