Research On Topic Evolution Analysis Based On Topic Word Embedding Model

Posted on: 2020-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: L Guo
GTID: 2428330590963044
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, hot topics have emerged in an endless stream, repeatedly igniting online public opinion, and almost every hot topic comes with an overwhelming volume of information. Effective analysis of topic evolution helps people follow the development of a topic in a timely manner, grasp how it evolves, summarize the patterns of topic development, and support scientific decision-making. Existing research is relatively coarse and lacks depth, with three main problems: (1) it focuses on capturing the global semantic information of a topic without taking local lexical semantic information into account, so semantic coherence is poor; (2) events that have a significant impact on the development of a topic are located with poor accuracy; (3) it is inefficient at clarifying how a topic develops and at displaying its evolution trend. The topic word embedding model can effectively solve or mitigate these problems. This thesis studies topic evolution analysis techniques based on the topic word embedding model. The main research results are as follows:

(1) New event detection within topics is studied. Traditional topic models cannot effectively balance the global topic semantic information and the local lexical semantic information of documents. In practical applications, the performance of new event detection is unsatisfactory and fluctuates greatly. A new event detection method within topics based on clustering of topic word embeddings is proposed. First, pre-processed documents are used to train the topic word embedding model to obtain topic word embeddings, which effectively balance global topic semantic information and local lexical semantic information. Then, K-means clustering is performed on the obtained topic word embeddings to obtain the distribution of sub-topics within the topic. Finally, new event detection within the topic is completed based on the time-stamp order of the documents in each sub-topic. Experimental results show that this method achieves better performance than traditional new event detection methods.

(2) Event evolution relationship identification is studied. The traditional word feature vector space model cannot accurately represent event semantics, and the comparison of event similarity stays at the word level. An event evolution relationship identification method based on the topic word embedding model is proposed. First, the topic word embedding model is trained on the documents to obtain topic word embeddings. Then, an event vector is constructed from the topic word embeddings corresponding to an event. Finally, event similarity is calculated from the event vectors to complete event evolution relationship identification. Experimental results show that, compared with existing related research, the method improves the performance of event evolution relationship identification.

(3) Topic evolution graph construction is explored. Traditional topic evolution graph construction methods fail to deeply exploit topic semantic information and lexical semantic information, and they require the number of clusters to be specified in advance. A topic evolution graph construction method based on event vector clustering is proposed. First, event vectors are generated from topic word embeddings. Second, the event vectors are clustered, which in turn clusters the documents. Then, nodes are identified according to the category labels of the documents, and the corresponding event vectors are used to establish edges between the nodes. Finally, a representative document is selected for each node, and a topic evolution graph is built from the nodes and edges. Experimental results show that the method generates a clear topic evolution graph and effectively visualizes the course of topic evolution.
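
As a rough illustration of steps (1) and (2) above, the following Python sketch assumes that topic word embeddings and sub-topic cluster labels (for example, from K-means) are already available. The abstract does not say how event vectors are built or which similarity measure is used, so the component-wise mean, the cosine similarity, the `threshold` value, and all function and field names here are hypothetical choices for illustration, not the thesis's actual method.

```python
from math import sqrt

def mean_vector(vectors):
    """Average equal-length vectors component-wise (assumed event-vector construction)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two vectors; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def first_documents(docs):
    """For each sub-topic label, keep the earliest document as the candidate new event."""
    first = {}
    for doc in docs:
        label = doc["label"]
        if label not in first or doc["time"] < first[label]["time"]:
            first[label] = doc
    return first

def evolution_edges(event_vectors, event_times, threshold=0.8):
    """Link an earlier event to a later one when their similarity reaches the threshold."""
    names = sorted(event_vectors, key=lambda name: event_times[name])
    edges = []
    for i, earlier in enumerate(names):
        for later in names[i + 1:]:
            if event_times[earlier] == event_times[later]:
                continue  # no direction can be assigned to simultaneous events
            if cosine(event_vectors[earlier], event_vectors[later]) >= threshold:
                edges.append((earlier, later))
    return edges

# Toy data: labels stand in for K-means sub-topic assignments (hypothetical values).
docs = [
    {"id": "d2", "time": 2, "label": 0},
    {"id": "d1", "time": 1, "label": 0},
    {"id": "d3", "time": 3, "label": 1},
]
new_events = first_documents(docs)

# Each event is represented by the mean of its (toy, 2-dimensional) word embeddings.
event_vectors = {
    "A": mean_vector([[1.0, 0.0], [0.8, 0.2]]),
    "B": mean_vector([[0.9, 0.1], [0.7, 0.3]]),
    "C": mean_vector([[0.0, 1.0], [0.2, 0.8]]),
}
event_times = {"A": 1, "B": 2, "C": 3}
edges = evolution_edges(event_vectors, event_times)
```

In this toy example, events A and B point in nearly the same direction and are linked, while C is not linked to either. In practice the threshold would have to be tuned on labeled data.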
Keywords/Search Tags:Topic evolution analysis, Topic word embedding model, New event detection within topics, Event evolution relationship identification, Topic evolution graph construction