Analysis Of User Viewed Content And Mining Of User Interest

Posted on: 2005-03-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Zhao
Full Text: PDF
GTID: 2168360125963819
Subject: Computer software and theory
Abstract/Summary:
With the development of the Internet and information technology, the problem of "information explosion" has arisen, that is, "rich data but poor information". How to manage the tremendous amount of information on the WWW to meet the growing demand for personalized information is a new research subject. Personalization, which has become a focus of research, means providing different service strategies and different service content to different users. The key is to know users' interests and to describe them with user profiles. The effectiveness of a personalized information service therefore depends on whether the user profiles reflect user interests accurately.

After studying the key technologies, namely web mining and user profile modeling, the author proposes a model for mining user interests. The model is based on the content users have viewed, combined with analysis of user behavior. Through analysis of the document representation model, feature extraction, and feature weighting, each web page is represented in the Vector Space Model (VSM).

The work in this thesis focuses on two aspects: content-based clustering and construction of the user interest model. After examining existing clustering algorithms and their practical application, the author proposes a new clustering algorithm that combines an agglomerative algorithm with the K-means algorithm. During clustering, the agglomerative algorithm is used first to obtain the cluster means and k, and the K-means algorithm then performs a second clustering pass. Once the clusters are obtained, the author uses a tree model to express the user's interests, written as ((,),(,),…,(,)). To make the user interest model easy to use and update, every user interest category is expressed in VSM in the same way as a web page, so a web page can be compared with an interest category using a similarity function.

When calculating the weight of an interest category, the author considers three factors: (1) the number of web pages is large; (2) within the set of web pages, content pages outnumber auxiliary pages; (3) the value of GILD (group inter-link degree) must be small.

Finally, the author conducts experiments on the methods discussed above. The experiments and analysis show that the new clustering algorithm and the tree-structured interest model are reliable and can be applied in a personalization system. Further work is to improve the validity of the user interest model and to apply it to recommendation.
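The two-stage clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not specify the feature weighting, the similarity function, the linkage rule, or the stopping criterion. The sketch assumes TF-IDF weights, cosine similarity, centroid linkage, and a similarity threshold that stops the agglomerative stage. All function names, the threshold value, and the sample pages are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Represent tokenized pages as sparse TF-IDF vectors (assumed weighting)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
                        for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed similarity function)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def centroid(vectors):
    """Mean vector of a cluster."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def agglomerate(vectors, threshold):
    """Stage 1: merge the two most similar clusters until no pair reaches
    the threshold. The surviving centroids give k and the K-means seeds."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        cents = [centroid([vectors[i] for i in c]) for c in clusters]
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(cents[i], cents[j])
                if s > best:
                    best, pair = s, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return [centroid([vectors[i] for i in c]) for c in clusters]

def kmeans(vectors, seeds, max_iter=20):
    """Stage 2: K-means seeded with the stage 1 centroids."""
    centroids = list(seeds)
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(len(centroids)),
                          key=lambda k: cosine(v, centroids[k]))
                      for v in vectors]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = centroid(members)
    return labels, centroids

# Hypothetical viewed pages, already tokenized.
pages = [
    "python code compiler code".split(),
    "compiler python program".split(),
    "football match goal".split(),
    "goal football league match".split(),
]
vectors = tfidf_vectors(pages)
seeds = agglomerate(vectors, threshold=0.2)   # threshold is an assumption
labels, interests = kmeans(vectors, seeds)
```

Because each resulting centroid lives in the same vector space as the pages, a page can be matched to an interest category by calling cosine(page_vector, interests[k]) and taking the best-scoring k, which mirrors the similarity-based comparison the abstract describes.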
Keywords/Search Tags:Content Clustering, User Profile, Web Pages, Vector Space Model, Personalization