
Research Of Weibo Topic Detection Model Based On Dirichlet Regression

Posted on: 2021-05-12  Degree: Master  Type: Thesis
Country: China  Candidate: Z W Du  Full Text: PDF
GTID: 2428330632453244  Subject: Computer technology
Abstract/Summary:
With the rapid development and popularization of the Internet in China, the cost of transmitting information has fallen to its lowest level ever. In today's Internet era, information grows far faster than before, and the convenience of the Internet allows it to be disseminated at will. As a result, governments around the world monitor and supervise online public opinion. As a representative social network application, Sina Weibo plays an important role in the spread of public opinion. Because Weibo posts are limited to short texts, the data are very sparse, and topic models applied directly to them often discover topics poorly. In addition, owing to its social nature, Weibo is full of marketing advertisements and entertainment content, while the public opinion that government departments need to supervise makes up a relatively small share. It is therefore of great significance to find public opinion on Weibo and to follow its development in a timely manner. Combining the data characteristics of public opinion events with the advantages of traditional LDA (Latent Dirichlet Allocation), this thesis proposes a probabilistic topic model based on Bayesian parameters to find the matching topics in Weibo data. The main work is as follows:
(1) The basic principles of probabilistic topic models and the underlying mathematical theory are introduced, and the relationship between Weibo data and news data in public opinion events is analyzed in detail.
(2) Based on the characteristics of Weibo and news data, a multivariate Dirichlet-multinomial regression model is designed. The topic allocation is learned from the news data as prior knowledge and then applied to the Weibo data to address its semantic sparsity.
(3) The algorithm and the parameter inference procedure of the model are described in detail. Qualitative and quantitative evaluation methods are designed, the model is tested on data from real cases, and its topic mining and clustering abilities are compared with those of the traditional LDA method.
The proposed method combines Weibo data with the characteristics of news data, which improves topic discovery on short-text data sets to a certain extent. The thesis also gives a practical application scheme: in a real case, the model can be used to detect and track the occurrence and development of public opinion events. Applying this topic model in practice can improve the detection and supervision of public opinion events on Weibo.
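As a rough illustration of the Dirichlet-multinomial regression idea mentioned in item (2), the sketch below assumes the commonly used log-linear form, in which each document's Dirichlet prior over topics is computed from document features as alpha_k = exp(x · lambda_k). The abstract does not state the exact parameterization used in the thesis, so the function names, the feature layout, and the weight values here are all hypothetical.

```python
import math

def dmr_prior(features, weights):
    """Dirichlet parameters alpha_k = exp(sum_j features[j] * weights[k][j]).

    Assumed log-linear form; not taken from the thesis itself.
    """
    return [math.exp(sum(x * w for x, w in zip(features, row)))
            for row in weights]

def expected_topic_mix(alpha):
    """Mean of a Dirichlet(alpha) distribution: alpha_k / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]

# Hypothetical setup: 2 topics, 2 features per post
# (a constant bias term, and an indicator that the post matches a news event).
weights = [
    [0.0, 2.0],  # topic 0: favoured when the news indicator is on
    [0.0, 0.0],  # topic 1: unaffected by the indicator
]

plain_post = expected_topic_mix(dmr_prior([1.0, 0.0], weights))
news_post = expected_topic_mix(dmr_prior([1.0, 1.0], weights))

print(plain_post)  # uniform prior over the two topics
print(news_post)   # prior shifted toward topic 0
```

In this toy setting the weights stand in for knowledge learned from news data: a short post carrying the news indicator starts with a prior that leans toward the news-related topic, which is one way such a prior could compensate for sparse text.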
Keywords/Search Tags:Short Text, Topic Model, Public Opinion, Dirichlet Distribution