
Search Engine Results Ranking Based On Web Page Clustering

Posted on: 2011-12-05
Degree: Master
Type: Thesis
Country: China
Candidate: S S Sun
Full Text: PDF
GTID: 2178360308990388
Subject: Computer Science and Technology
Abstract/Summary:
As web resources continue to grow, more users rely on search engines to find information. While users enjoy the convenience of information retrieval systems, they have also come to recognize how difficult it can be to obtain the information they need. On the one hand, current search engines return a large number of results based on literal (surface-form) matching between page content and the query. Because a query can have broad semantics, results on different topics are mixed together in the returned list, and users must sift through them repeatedly, which costs considerable time. On the other hand, the search results are not personalized. To address these issues, this thesis proposes search engine results ranking based on web page clustering.
Firstly, to resolve the mixing of topics in search results and help users locate valuable information quickly and accurately, this thesis applies text clustering to search results and proposes a search results clustering method based on subject phrases. A new feature extraction method is presented in which the feature vector is composed of subject phrases and high-frequency words. In addition, a synonym dictionary is used to expand the semantics of the feature terms, and a modified k-means algorithm is applied to cluster the search results and extract a label for each category.
Secondly, to address personalization, a search results ranking algorithm based on user interests and web page clustering is proposed. By mining user interests and building an interest model, we sort the clusters according to the user's interest categories, expand the category labels based on the user interest model, and adjust the page order within the categories that interest the user.
Finally, experimental results show that the clustering-based search engine results ranking algorithm can improve result quality and query efficiency. Search results clustering based on subject phrases increases clustering precision, and results ranking based on user interests achieves personalization. The approach still has shortcomings, however, that require further improvement.
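The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it uses plain k-means with cosine similarity rather than the modified k-means, represents pages by word frequencies only (no subject-phrase extraction), and does not expand category labels. The synonym dictionary, stopword list, sample results, and interest weights are all hypothetical values chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical synonym dictionary: maps a variant to one canonical term.
SYNONYMS = {"automobile": "car", "cars": "car", "fruits": "fruit"}
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "to", "is"}

def features(text):
    """Term-frequency vector (a Counter) after synonym normalization."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(SYNONYMS.get(t, t) for t in tokens if t not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, k, iterations=10):
    """Plain k-means; seeds are the first k vectors (assumes k <= len(vectors))."""
    centers = [dict(v) for v in vectors[:k]]
    assign = []
    for _ in range(iterations):
        assign = [max(range(k), key=lambda c: cosine(v, centers[c]))
                  for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = centroid(members)
    return assign, centers

def label(center, query_terms, n=1):
    """Highest-weight centroid terms, skipping the query's own words."""
    terms = sorted((t for t in center if t not in query_terms),
                   key=lambda t: (-center[t], t))
    return terms[:n]

def personalized_order(vectors, assign, centers, interest):
    """Order clusters, then pages within each cluster, by similarity to
    the user interest model (a dict of term -> weight)."""
    clusters = sorted(range(len(centers)),
                      key=lambda c: -cosine(interest, centers[c]))
    order = []
    for c in clusters:
        members = [i for i, a in enumerate(assign) if a == c]
        members.sort(key=lambda i: -cosine(interest, vectors[i]))
        order.extend(members)
    return order

# Hypothetical result snippets for the ambiguous query "apple".
results = [
    "Apple fruit nutrition and health benefits",
    "Apple iPhone release and company news",
    "Fresh fruits: apple pie recipe",
    "Apple company stock and iPhone sales",
]
vectors = [features(r) for r in results]
assign, centers = kmeans(vectors, k=2)
interest = {"iphone": 0.9, "stock": 0.4}  # hypothetical user interest model
for i in personalized_order(vectors, assign, centers, interest):
    print(label(centers[assign[i]], {"apple"}), results[i])
```

In this toy data the two fruit-related snippets and the two company-related snippets fall into separate clusters, and a user whose interest model weights "iphone" and "stock" sees the company cluster first. Seeding k-means with the first k results keeps the sketch deterministic; a practical system would need a more robust initialization.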
Keywords/Search Tags: Search Engine, Text Clustering, Personalized Ranking, User Interest Model