Topic Model Based User Modeling

Posted on: 2014-02-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W F Li
Full Text: PDF
GTID: 1228330401963113
Subject: Computer Science and Technology
Abstract/Summary:
User modeling is a method for modeling user interests by analyzing user data. As the Internet has developed, information overload has become an increasingly serious problem, and user modeling offers a way to deliver targeted information services that meet each user's personalized needs.

This dissertation focuses on topic model based user modeling, incorporating progressively more user information in four steps. First, user text alone is modeled in a semi-supervised way. Second, users and user characteristics are considered jointly to construct a user characteristics topic model. Then, building on the user characteristics topic model, user tags are introduced in one extension and user attributes in another. The main contributions of this dissertation are as follows.

A semi-supervised LDA with topic features is proposed to analyze user text. The model can make efficient use of supervised information when modeling user text. Experiments show that introducing topic features as supervised information yields better performance without increasing computational complexity. On user browsing data, the model's degree of concentration-divergence reaches 1.902, an improvement of about 1.9% over the best LDA result. In addition, the model can label documents automatically.

A method is proposed to capture user characteristics within a topic model framework. In this model, each word is influenced both by its topic and by the user characteristics of its associated user, while each user is influenced only by his or her user characteristics. Experiments show that incorporating user characteristics gives better performance on standard topic modeling tasks. On the CiteSeerX dataset, the model's text perplexity reaches 1118.37, an improvement of about 12.2% over the best LDA result. The model also gives attractive results on user characteristics.

A user characteristics Tag-LDA is proposed, which considers both text and user tags when modeling. In this model, text and tags share the same document topic distribution, and each tag is also influenced by the user characteristic that generates its associated user. Experiments show that this model performs better than Tag-LDA on the text modeling task. On the del.icio.us dataset, the model's perplexity on text and on tags reaches 2736.02 and 70.62 respectively, improvements of about 1.4% and 18.8% over the best Tag-LDA result. The results also indicate that, in a social tagging system, user characteristics often manifest as user interests and wording preferences.

A user characteristics topic model with user attributes is proposed. In this model, the user characteristics distribution is determined by user attributes, which act as the user's external manifestation. Experimental results show that this model performs better on the text modeling task. On the CiteSeerX dataset, the model's text perplexity reaches 1217.16, an improvement of about 11.2% over the best user characteristics topic model result. The model also gives attractive results on user interests when user attributes are used to slice user characteristics.
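
The abstract reports several results as perplexity values together with percentage improvements over a baseline. As a minimal sketch of how such figures are commonly computed, the Python below uses the standard definition of perplexity (the exponential of the negative mean per-token log-likelihood, where lower is better) and a relative reduction against a baseline. The function names `perplexity` and `relative_improvement` and the toy inputs are illustrative assumptions; the abstract does not state the exact evaluation procedure or whether its percentages are relative reductions.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Standard perplexity: exp of the negative mean natural-log token probability.

    Assumption: each entry is the log-probability a model assigns to one
    held-out token. Lower perplexity means a better fit.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_improvement(baseline: float, proposed: float) -> float:
    """Fractional reduction of a lower-is-better metric relative to a baseline.

    Assumption: this is one common way to report "improves about X%";
    the abstract does not confirm which convention it uses.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


if __name__ == "__main__":
    # Toy data only: a model that gives probability 1/4 to each of 8 tokens
    # has perplexity of approximately 4.
    toy_log_probs = [math.log(0.25)] * 8
    print(f"toy perplexity: {perplexity(toy_log_probs):.4f}")

    # Hypothetical numbers, not taken from the dissertation.
    print(f"toy improvement: {relative_improvement(100.0, 90.0):.1%}")
```

Under this convention, a model that spreads probability uniformly over k choices has perplexity k, which is why the metric is often read as an effective branching factor.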
Keywords/Search Tags: user modeling, topic model, topic feature, user characteristics, user attributes