
Topic Mining And Prediction From Microblogs Based On Topic Model

Posted on: 2019-06-19  Degree: Master  Type: Thesis
Country: China  Candidate: Q Jiang  Full Text: PDF
GTID: 2348330566959018  Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet information technology, the microblog (weibo), a new kind of online social platform characterized by fast propagation, wide coverage, diverse content, high interactivity, and real-time information, has quickly become an important channel through which news media organizations and individuals share and exchange moments of their lives. Sina Weibo alone receives hundreds of millions of new posts, many of which carry current-affairs information, and the volume continues to grow. The key problem in text topic mining is how to integrate this fragmented information effectively and how to recommend information from the mass of microblog posts according to users' interests.

Traditional text topic models achieve low accuracy on microblog topic mining and do not consider the relationships between topics. To address these problems, this thesis starts from the traditional LDA model, takes into account the four types of microblog (@ microblogs, microblogs, topic-type microblogs, and reply-forwarding microblogs), analyzes the advantages and disadvantages of the LDA and HMM models, and, according to the characteristics of the Chinese microblog corpus, proposes the microblog topic mining model MB-HL (Microblog-Hidden Markov Model & Latent Dirichlet Allocation). MB-HL takes a single microblog post as its processing unit and builds a topic-word distribution matrix, which is optimized using different LDA models for microblog user behavior modeling and feature extraction. It uses the time-series modeling ability of the HMM to compensate for LDA's weakness in capturing strong correlations between topics, and the model structure and parameters are determined by Gibbs sampling inference. Experiments on real Sina Weibo data show that the MB-HL model improves keyword accuracy by nearly 9% and can effectively discover the relationships between topics.

To further optimize the performance of MB-HL, the thesis analyzes the LSTM, an end-to-end deep learning model, and proposes DMB-HL (Deep MB-HL), which is based on deep learning feature representations. The LSTM's network structure automatically produces feature representations that fuse document-level semantics. These representations are combined with the probabilistic topic model in a deep learning network, which uncovers more hidden microblog topic information.
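
The Gibbs sampling inference mentioned in the abstract can be illustrated with a minimal sketch of collapsed Gibbs sampling for plain LDA. This is not the MB-HL model itself, which adds an HMM component and microblog-specific structure that the abstract does not specify. The function name `lda_gibbs`, the hyperparameter values, and the toy corpus are assumptions made purely for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative sketch).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (theta, phi): document-topic and topic-word distributions.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # words in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # total words assigned to topic k
    assignments = []

    # Random initialisation of the topic assignment of every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1

                # Unnormalised conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]

                # Record the newly sampled assignment.
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Point estimates from the final counts.
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + num_topics * alpha) for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta) for w in range(vocab_size)]
        for t in range(num_topics)
    ]
    return theta, phi


# Toy corpus (hypothetical): each post is a list of word ids from a 6-word vocabulary.
toy_posts = [[0, 1, 2, 0, 1], [3, 4, 5, 4], [0, 2, 1, 1], [5, 3, 4, 3]]
theta, phi = lda_gibbs(toy_posts, num_topics=2, vocab_size=6)
for d, row in enumerate(theta):
    print(d, [round(p, 3) for p in row])
```

Each row of `theta` is one post's topic distribution and each row of `phi` is one topic's word distribution. A model along the lines described in the abstract would additionally relate topics over time, which this sketch does not attempt.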
Keywords/Search Tags:LDA, Gibbs sampling, Microblog, MB-HL model, LSTM model