
Research On Microblog Hot Topic Discovery And Evolution Analysis For Public Opinion Monitoring

Posted on: 2019-04-20    Degree: Master    Type: Thesis
Country: China    Candidate: Y Dang    Full Text: PDF
GTID: 2428330563997762    Subject: Engineering
Abstract/Summary:
As one of the most popular social tools, Weibo is widely used by Internet users. Its short content, simple operation, timely spread, and freedom of speech allow netizens to express their opinions, obtain the information they want, and forward and comment on others' posts. As the number of Internet users grows and microblogging tools are used more frequently, the data concentrated on microblogging platforms is growing explosively. The format and content of this data can be very scattered; filtering it by hand not only increases the workload but also makes it difficult to find hot topics quickly.

Most existing hot topic discovery work uses text clustering algorithms based on the vector space model. These achieve good results on long text, but microblog posts are short and contain few feature words, so judging similarity simply by the meaning of words reduces the accuracy of topic discovery. To address this, the work completed in this paper is as follows:

The Latent Dirichlet Allocation (LDA) model is used to discover hot microblog topics at different times, and the accuracy of topic discovery is verified by comparison with the traditional K-means algorithm.

In Weibo topic discovery, the number of topics normally needs to be set manually. This paper adopts the Chinese Restaurant Process to determine the number of Weibo topics dynamically, avoiding manual setting of the topic number.

Data on the network is updated continuously, so it is impossible to obtain all the data at one time. If the model is retrained on all the data each time, it not only consumes a lot of time but also fails to track topics promptly. Topics also evolve: they are discussed with different focuses at different times. To capture this evolution in time, a dynamic incremental topic evolution model is built on top of hot topic discovery. The data is divided by time into historical data sets and incremental data sets; the results of hot topic discovery are used to infer the topic distribution of the incremental data sets and to track topic content. Experimental analysis shows that this model can demonstrate the evolution of topic content and save time.

A system for microblog hot topic discovery and evolution analysis is designed and developed, comprising a data preprocessing module, a topic discovery module, a topic evolution analysis module, and a personal information maintenance module. The system is tested on real data and the functions of each module are demonstrated, verifying the feasibility and effectiveness of the above work.
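To illustrate how a Chinese Restaurant Process can leave the number of topics open rather than fixed in advance, the following is a minimal sketch of sampling from the CRP prior alone. It is not the model described in the abstract, which would combine such a prior with the LDA likelihood of the microblog text; the function name `crp_assign` and the parameters `n_items`, `alpha`, and `seed` are hypothetical names chosen for this example. Each item joins an existing group with probability proportional to that group's size, or opens a new group with probability proportional to `alpha`.

```python
import random
from typing import List, Optional


def crp_assign(n_items: int, alpha: float, seed: Optional[int] = None) -> List[int]:
    """Sample group (topic) labels for n_items items from a CRP prior.

    Hypothetical helper for illustration; alpha is the concentration
    parameter (larger alpha tends to produce more groups).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    counts: List[int] = []       # counts[k] = number of items in group k
    assignments: List[int] = []  # assignments[i] = group label of item i
    for _ in range(n_items):
        # Existing group k has weight counts[k]; a new group has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)     # open a new group
        else:
            counts[k] += 1       # join an existing group
        assignments.append(k)
    return assignments


if __name__ == "__main__":
    labels = crp_assign(n_items=1000, alpha=2.0, seed=42)
    # The number of groups is an outcome of sampling, not an input.
    print("groups discovered:", max(labels) + 1)
```

The number of groups grows slowly with the number of items, which is why this kind of prior is a common choice when the topic count should adapt to the data instead of being set by hand.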
Keywords/Search Tags: Hot topic discovery, LDA model, Topic evolution, Chinese Restaurant Process