
The Research Of Personalized Search Algorithm Based On User Interest Model

Posted on: 2014-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xiao
GTID: 2248330395991756
Subject: Computer application technology
Abstract/Summary:
With the rapid growth in the amount of information on the Internet, search engines were developed to help people find the information relevant to them. This was a major milestone in the development of resource querying. However, as users' demands increase, traditional search engines can no longer meet their needs, and shortcomings such as low precision and duplicated web pages are coming to the fore. To meet users' needs better, personalization and intelligence have become the development trend of search engines. This paper makes an in-depth study of the personalized search engine.

First, building on a study of existing user interest models, this paper proposes a new user interest model construction algorithm. In this method, SVD and the k-means clustering algorithm are applied several times at different granularities to create two weighted interest trees: a document class tree and a word class tree. Each node in a tree is weighted, and the weight represents the user's degree of interest in that type of document or word. Experiments confirm the validity of the method.

Second, this paper puts forward an improved method to address the shortcomings of the vector space model, namely using SVD to reduce its dimensionality. The document-word category matrix generated by the SVD addresses the vector space model's high dimensionality and matrix sparsity, as well as the problems of synonymy and polysemy. The experimental results show that the vector space model proposed in this paper performs better than the traditional vector space model.

Finally, on the basis of the user interest model presented in this paper, a new ranking algorithm is put forward to address the shortcomings of existing search engine ranking algorithms. It uses a Bayesian classification algorithm and a scoring algorithm to calculate a score for each page returned by the traditional search engine, and the scored pages are then ranked in descending order. The experimental results show that the personalized ranking algorithm proposed in this paper has higher accuracy than the personalized search algorithm based on the probability model under the same conditions, and that it meets users' needs better.
Keywords/Search Tags: User profile, Singular value decomposition, K-means clustering algorithm, Vector space model, Naive Bayesian classifier, Personalized ranking
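
The ranking step described in the abstract (classify each returned page with a Bayesian classifier, score it, then sort in descending order) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the toy training data, the category names, the flat `interest_weights` dictionary standing in for the weighted interest trees, and the scoring rule (the posterior-weighted sum of interest weights).

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Collect per-category document and word counts from (category, words) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for category, words in labeled_docs:
        class_counts[category] += 1
        word_counts[category].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def posterior(model, words):
    """Return P(category | words) using multinomial naive Bayes with add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for category, n_docs in class_counts.items():
        total_words = sum(word_counts[category].values())
        log_p = math.log(n_docs / total_docs)
        for w in words:
            if w not in vocab:
                continue  # ignore words never seen in training
            log_p += math.log((word_counts[category][w] + 1) / (total_words + len(vocab)))
        log_scores[category] = log_p
    # Normalize in a numerically stable way (log-sum-exp).
    top = max(log_scores.values())
    unnormalized = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


def score_page(model, interest_weights, words):
    """Hypothetical scoring rule: expected user interest under the category posterior."""
    post = posterior(model, words)
    return sum(p * interest_weights.get(c, 0.0) for c, p in post.items())


def rerank(model, interest_weights, pages):
    """Sort (page_id, words) pairs by score, highest first, and return the page ids."""
    scored = [(score_page(model, interest_weights, words), pid) for pid, words in pages]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [pid for _, pid in scored]


# Toy data (hypothetical): two interest categories and three returned pages.
training = [
    ("sports", ["football", "match", "goal"]),
    ("sports", ["tennis", "match", "player"]),
    ("tech", ["python", "code", "compiler"]),
    ("tech", ["search", "engine", "code"]),
]
model = train_naive_bayes(training)
interest_weights = {"sports": 0.2, "tech": 0.8}  # assumed user interest per category
pages = [
    ("p1", ["football", "goal", "player"]),
    ("p2", ["python", "search", "engine"]),
    ("p3", ["match", "code"]),
]
ranking = rerank(model, interest_weights, pages)
print(ranking)
```

With these assumed weights, the page whose words point most strongly to the higher-weighted category is placed first, and a page that is ambiguous between the two categories falls in the middle. A tree-structured interest model, as described in the abstract, would replace the flat dictionary lookup with a lookup of node weights.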