
Research On Bursty Topic Detection And Tracking Method Based On Topic Model

Posted on: 2020-05-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y Cai
Full Text: PDF
GTID: 2428330590979159
Subject: Engineering
Abstract/Summary:
In recent years, with the rise of Web 2.0 social networks, Weibo has become popular because it is simple and convenient. It has become an important way for people to publish and receive information, and it provides an important public platform for sharing information. Weibo continuously generates hundreds of millions of text streams, and these massive streams contain a wealth of potential knowledge. On Weibo, users can browse topics of interest and read and discuss related content. When people pay attention to a topic, the number of Weibo posts associated with it surges within a short period of time. A Weibo bursty topic is an emerging online topic that has a strong impact on users and society over a short period. Therefore, if bursty topics can be detected promptly in large volumes of Weibo data, and people's views and feelings about them understood, the results can support the management of online public opinion by the government and other relevant departments, and can help enterprises develop effective business strategies in a timely manner. Analyzing Weibo users' attention to bursty topics also helps improve personalized user services. Detecting and tracking bursty topics in the Weibo short text stream therefore has important application value. However, Weibo text streams are short text streams, and detecting bursty topics in short text streams is more challenging. To this end, this thesis studies Weibo topic detection and tracking from the following three aspects:

(1) Taking full advantage of the bursty features in Weibo text, a Weibo bursty topic detection method based on the BTM topic model is proposed. The method captures the Weibo information flow with a dynamically sliding window and adjusts the size of the time window according to the information flow. Further, drawing on principles of physical dynamics and considering both the timing of Weibo posts and users' social network behavior, it models the Weibo text data by introducing a time decay factor and a Weibo hot search factor. This achieves effective extraction of bursty features and effective filtering of repeated pseudo-bursty features, and it addresses the dynamic, real-time variation of Weibo's spatial characteristics, the heavy information noise, and the difficulty of judging topic novelty. On this basis, the BTM (Biterm Topic Model) is used to model topics, and the k-means clustering algorithm is used to cluster the bursty features. The topic distributions of the bursty features are then ranked in combination with the topic clusters, and the bursty features are used to visualize the bursty topics, yielding the final bursty topics.

(2) Addressing the characteristics of topic evolution, this thesis proposes an evolution tracking method for Weibo bursty topics based on the BTM topic model. Bursty topics that have already been detected on Weibo continue to evolve over time, and some even reverse course, so the focus of users' attention on a bursty topic differs at different times. Therefore, in view of the dynamic, real-time nature of Weibo, the probabilistic topic model BTM is improved and extended into a Weibo bursty topic evolution tracking model. Building on BTM, the method introduces a binary indicator variable that indicates whether an extracted topic is the same as a detected bursty topic. If it is, the extracted topic is combined with the detected bursty topic to form a new topic set, and the topic set is divided by time slice. The KL distance (Kullback-Leibler divergence) is used to compute the distance between bursty topics in adjacent time slices, which allows the evolution of bursty topics to be analyzed and tracked and completes the Weibo topic detection process.

(3) Based on the above methods, this thesis designs and implements a complete demonstration system for Weibo bursty topic detection and tracking. The system provides data acquisition, text preprocessing, Weibo bursty topic detection, and Weibo bursty topic evolution tracking, and it presents the relevant information visually.
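
The abstract names two building blocks: biterms, the unit that BTM models, and the KL distance between topics in adjacent time slices. The following is a minimal sketch of both, not the thesis's implementation. The function names `extract_biterms` and `kl_divergence`, the smoothing constant `eps`, and the example data are assumptions made for illustration. The abstract also does not say which direction or variant of KL is used; this sketch computes the standard KL(p || q).

```python
import math
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text.

    The whole post is treated as a single context, which is a common
    choice for short texts. Each pair is sorted so (a, b) == (b, a).
    """
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def kl_divergence(p, q, eps=1e-12):
    """Compute KL(p || q) for two discrete distributions.

    p and q are sequences of probabilities over the same vocabulary,
    in the same order. A small eps (an assumed smoothing constant) is
    added to avoid log(0), and both vectors are renormalized.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    zp, zq = sum(p_s), sum(q_s)
    return sum(
        (a / zp) * math.log((a / zp) / (b / zq))
        for a, b in zip(p_s, q_s)
    )


# Hypothetical usage: one tokenized post, and one topic's word
# distribution in two adjacent time slices.
post = ["weibo", "topic", "burst"]
biterms = extract_biterms(post)

topic_slice_t = [0.5, 0.5]
topic_slice_t1 = [0.9, 0.1]
drift = kl_divergence(topic_slice_t, topic_slice_t1)
```

A small `drift` would suggest the topic's word distribution changed little between the two slices, while a large value would suggest the topic has evolved. KL is asymmetric, so a symmetrized form may be preferable when the direction of comparison should not matter.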
Keywords/Search Tags: Topic model, bursty feature, bursty topic, topic evolution, time sequence