
Topic Model Based on Mixture LDA Model in Microblogging Services

Posted on: 2016-01-07 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Wang | Full Text: PDF
GTID: 2308330473465486 | Subject: Software engineering
Abstract/Summary:
Since Twitter and Facebook became popular, microblogging has become an indispensable part of people's social lives. Users can access a microblog platform from a PC or a wireless device, and they post messages to express their feelings about events in their daily lives. A microblog post is limited to 140 characters and may combine pictures and text. Microblog data is semi-structured: it contains structured user information and unstructured text. How to process such semi-structured data and extract useful information from it is a hot topic in data mining.

A topic is a summary of a user's microblog content and can reflect the user's interests. By mining the topics of a user's microblogs, personalized recommendations can be made to that user. In this thesis, we propose a mixture LDA model to mine the topic distribution of microblog content. With the mixture LDA model we obtain the topic distribution of a user's microblogs for different microblog types.

In addition, we use perplexity to evaluate the topic models, and we run experiments comparing mixture LDA with two other models. The experimental results show that the mixture LDA model reaches lower perplexity and needs fewer iterations than the other two models, so it performs better on both measures. Finally, we apply the results to personalized recommendation.

The main contributions of this thesis are as follows:
1) By analyzing the content characteristics of Sina microblogs, we put forward a mixture LDA model that mines microblog topics effectively.
2) To bring the model's results closer to the user's real interest tendencies, we assign different parameters to different microblog characteristics within the mixture LDA model.
3) We collect a dataset from the Sina microblog platform and obtain good results on this real dataset.

Through the model we obtain each user's microblog topic distribution and the hot topics people are interested in, and these results can be applied to topic-based personalized recommendation.
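The abstract uses perplexity as its evaluation measure but does not state how it is computed. As a minimal sketch, assuming the standard definition for LDA-style models (the exponential of the negative average per-token log-likelihood), it could be calculated as below. The function name `perplexity` and the inputs `docs`, `theta`, and `phi` are illustrative assumptions, not names from the thesis, and the snippet does not reproduce the mixture LDA model itself.

```python
import math


def perplexity(docs, theta, phi):
    """Perplexity of held-out documents under a fitted topic model.

    Hypothetical inputs (not taken from the thesis):
      docs  - list of documents, each a list of integer word ids
      theta - theta[d][k], probability of topic k in document d
      phi   - phi[k][w], probability of word w under topic k
    """
    log_likelihood = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for w in doc:
            # Marginal word probability: sum over topics of p(k | d) * p(w | k).
            p_w = sum(theta[d][k] * phi[k][w] for k in range(len(phi)))
            log_likelihood += math.log(p_w)
            n_tokens += 1
    # Lower values mean the model predicts the held-out words better.
    return math.exp(-log_likelihood / n_tokens)


# Toy check: two topics that are both uniform over a 4-word vocabulary.
toy_docs = [[0, 1, 2, 3]]
toy_theta = [[0.5, 0.5]]
toy_phi = [[0.25] * 4, [0.25] * 4]
toy_value = perplexity(toy_docs, toy_theta, toy_phi)
```

In the toy check every word has probability 0.25, so the perplexity equals the vocabulary size; a model that concentrates probability on the words that actually occur would score lower, which is the sense in which the abstract calls lower perplexity better.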
Keywords/Search Tags: Sina Microblog, Topic Model, Microblog types, mixture LDA, SNS