
Research On Evolution Model Of Microblog Topic Based On Time Sequence

Posted on: 2020-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: P Y Zhang
Full Text: PDF
GTID: 2428330602452145
Subject: Information Science
Abstract/Summary:
The Internet has penetrated every aspect of national life, and new media has gradually become the main platform for releasing and exchanging information. Online public opinion has become an important part of public opinion as a whole. Because the network is virtual and open, online public opinion fluctuates repeatedly as it spreads. In the long-term evolution of hot events in particular, there are many peaks, and the focus of the content is likely to shift over time, changing gradually in stages. Therefore, timely detection of shifts in the focus of hot topics and dynamic tracking of topic evolution trends can provide a more complete evolution track of events and help netizens grasp the context of news events more intuitively and clearly. This is of great significance to, and an important part of, the analysis of online public opinion.

The paper first analyzes the meaning of topic evolution and surveys the current state of research on topic evolution models. It then proposes a topic evolution framework based on the characteristics of microblog text data. A topic event has a different focus at each stage of its development, so topics can be segmented by time slice. Analyzing how the focus and content of topics change across time slices reveals the law of topic evolution. The paper analyzes the distribution characteristics of focus feature words and noise words, constructs an extraction formula for focus words, and establishes a collection of focus feature words. First, the Skip-gram model is used to train word vectors on the text set. The text of each time slice is input into the BTM to obtain candidate topics, and topic word vectors are constructed in the BTM topic dimension. Second, the k-means algorithm is used to cluster the topic word vectors to obtain the fused topics, and the topic evolution of the text set over the time slices is established.

Experiments on real data sets and comparisons with related methods show that focus topic recognition based on word vectors can effectively extract topics at all stages. Introducing word vectors makes full use of the similarity between words to improve topic clustering. The paper also analyzes topic evolution in terms of content and intensity: the Word Mover's Distance (WMD) algorithm is used to calculate the similarity between topics, and a calculation method based on microblog weight and topic probability is proposed for the analysis of topic strength.
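
The fusion step described in the abstract, which turns candidate topics into vectors and clusters them with k-means, can be illustrated with a minimal Python sketch. The abstract does not say how a topic word vector is built, so the probability-weighted average of word vectors used here is an assumption. The function names `topic_vector` and `kmeans`, the toy two-dimensional word vectors, and the seeding of centroids with the first k points are also hypothetical choices made for illustration. In the thesis the word vectors come from a trained Skip-gram model and the candidate topics from the BTM.

```python
from math import dist


def topic_vector(top_words, word_vectors):
    """Probability-weighted mean of the vectors of a topic's top words.

    top_words: list of (word, probability) pairs, e.g. from a topic model.
    word_vectors: dict mapping word -> list of floats.
    The weighting scheme is an assumption, not taken from the thesis.
    """
    known = [(w, p) for w, p in top_words if w in word_vectors]
    total = sum(p for _, p in known)
    if total == 0:
        raise ValueError("no top word has a usable vector")
    dim = len(word_vectors[known[0][0]])
    vec = [0.0] * dim
    for w, p in known:
        for i, x in enumerate(word_vectors[w]):
            vec[i] += p / total * x
    return vec


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Toy data (hypothetical): two-dimensional "word vectors" and three
# candidate topics, two of which share the same focus words.
word_vectors = {
    "flood": [1.0, 0.0], "rain": [0.9, 0.1],
    "rescue": [0.0, 1.0], "donation": [0.1, 0.9],
}
candidate_topics = [
    [("flood", 0.6), ("rain", 0.4)],
    [("rescue", 0.7), ("donation", 0.3)],
    [("rain", 0.5), ("flood", 0.5)],
]
vectors = [topic_vector(t, word_vectors) for t in candidate_topics]
labels = kmeans(vectors, k=2)
print(labels)  # candidate topics with the same label are fused
```

In this toy run the first and third candidate topics receive the same label, so they would be merged into one fused topic. A real pipeline would choose k and the centroid initialization more carefully than this sketch does.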
Keywords/Search Tags: Network Topics, Focus Features, Biterm Topic Model, Word Embedding, Topic Similarity, Topic Evolution