Research On Microblog Topic Detection And Tracking Based On BTM Model

Posted on: 2018-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Wang
Full Text: PDF
GTID: 2358330518960448
Subject: Computer application technology
Abstract/Summary:
In recent years, the rapid development of micro-blogs has made communication more convenient, and micro-blogging has become an important part of social networks. It is characterized by fast spread and real-time interaction. A micro-blog is a service platform for publishing and accessing information, and it continuously reshapes people's online life. Users can browse news, post fresh content to share with friends, and join discussions of hot topics in their leisure time. Micro-blogs produce a large volume of data in the form of short text messages, which leads to information overload and homogeneous content. It is hard for users to find the subjects they care about and to keep up with subsequent related topics. Research on topic detection and tracking can not only provide users with topics of interest and their follow-up reports, but also help guide public opinion effectively, so it has practical significance.

Micro-blog posts are short, lack rich context, and have low word frequencies. Traditional topic models therefore face a severe data sparseness problem when applied to micro-blog data, which reduces their performance. Taking the characteristics of micro-blogs into account, this paper proposes a micro-blog topic detection and tracking method based on BTM (biterm topic model).

First, this paper introduces the BTM topic model to process micro-blog data. BTM learns topics from word co-occurrence: all the biterms (co-occurring word pairs) together constitute the corpus from which topics are extracted, which makes fuller use of the term information available in short texts. Compared with the LDA model, BTM effectively alleviates the data sparseness problem.

Second, we use K-means to study topic detection. After BTM modeling, the data set is more focused and the differences between topics are obvious, so the K-means clustering algorithm can separate the topics well. After combining BTM with K-means and with hierarchical clustering respectively and comparing the two results, this paper adopts the BTM & K-means clustering algorithm for micro-blog topic detection.

Finally, on the basis of BTM modeling, we use the topic-word table to perform topic tracking. In this process, a sequence-weighted method is adopted to improve the similarity calculation: the semantic similarity between words is compared, and the similarity between the topic-word table of a micro-blog and that of the BTM model is calculated, which addresses the semantic shortcomings of the features.
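The biterm idea described in the abstract can be illustrated with a minimal sketch. It assumes a biterm is any unordered pair of words that co-occur in the same short post, and that the biterms of all posts are pooled into one corpus-level collection. The function names `extract_biterms` and `build_biterm_corpus` and the sample posts are hypothetical, chosen only for illustration; this is not the thesis's implementation, and the topic inference step itself is not shown.

```python
from collections import Counter
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one tokenized post.

    Hypothetical helper: each pair is sorted so that ("a", "b") and
    ("b", "a") are treated as the same biterm.
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def build_biterm_corpus(posts):
    """Pool the biterms of all posts into one corpus-level counter."""
    counts = Counter()
    for tokens in posts:
        counts.update(extract_biterms(tokens))
    return counts


# Made-up sample data: two short, already tokenized posts.
posts = [
    ["typhoon", "warning", "coast"],
    ["typhoon", "coast", "evacuation"],
]
biterm_counts = build_biterm_corpus(posts)

for biterm, count in sorted(biterm_counts.items()):
    print(biterm, count)
```

Because the pairs are counted over the whole corpus rather than per post, a pair such as "coast" and "typhoon" accumulates evidence across posts even though each post contains only a few words. This pooling is what lets a biterm-based model cope with the sparseness of individual short texts.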
Keywords/Search Tags:Micro-blog, Topic Model, Topic Detection, Topic Tracking, Topic-word Table