Design And Implementation Of Event Evolution Analysis System Based On Topic Model

Posted on: 2021-03-23
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Ji
Full Text: PDF
GTID: 2428330620464207
Subject: Engineering
Abstract/Summary:
With the continuous development of Internet and multimedia technology, news media have become an important tool for people to follow the development of events. As the carrier of events, news is characterized by authenticity, timeliness, openness, and variability. Mining the information of each development stage of an event from the news released by the media, and accurately analyzing the event's evolution context and evolution heat index, helps people understand news events comprehensively and helps the government and news media guide and control the direction in which news events develop. At present, research on event evolution analysis has the following main problems: (1) the representation of news events is not comprehensive; (2) the direction of event evolution cannot be effectively analyzed; (3) a complete visual system for event evolution analysis is lacking.

In this thesis, Xinhua news is taken as the research data, and the evolution context and evolution heat index of news events as the research content. After summarizing and analyzing the state of research in China and abroad and reviewing the relevant principles and technologies, the thesis first improves the text representation, representing news text from multiple angles with a feature vector, a semantic vector, and a topic vector. Second, new event detection and event category labeling are implemented with an improved single-pass clustering algorithm, and the latent Dirichlet allocation (LDA) model is used to analyze the event evolution process. An event evolution analysis system is also built on this research. The specific research contents are as follows:

(1) A quantitative text representation method based on multi-vector fusion is proposed. The feature vector is generated with the TF-IDF feature selection algorithm, and the document-topic matrix generated by the topic model is used as the topic vector. To address the sparsity of the word vectors generated by the word embedding model, a seq2seq model is introduced to compress the space and generate the semantic vector. Combining the feature vector, topic vector, and semantic vector yields a text vector, so that the quantitative representation of news text carries feature, topic, and semantic information and provides comprehensive and accurate vector input for subsequent news event analysis.

(2) Event evolution analysis based on the topic model is studied. First, to address the sensitivity of the classical single-pass algorithm to the order of the input text stream, a single-pass clustering algorithm based on a double-threshold clustering criterion is proposed and applied to news topic detection and labeling. Second, the evolution of events within a topic is analyzed: the time window is divided according to the smoothness of keywords and the uniformity of the news distribution within the topic, the LDA topic model is used to analyze the development stages of events and generate their evolution context, and the heat index trend over the course of event evolution is calculated from a heat formula. Finally, the direction of event evolution is analyzed from two aspects, the evolution context and the evolution heat index.

(3) The event evolution analysis system is designed and implemented. The system takes the algorithms as its core, with news collection, news annotation, news evolution context, and news heat index as the main functional modules. Functional and performance tests verify that the system can provide users with comprehensive event evolution information.
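The abstract does not spell out the fusion rule or the double-threshold criterion, so the following Python sketch is only one plausible reading, not the thesis's actual method. It assumes fusion is a concatenation of the three L2-normalized vectors, and it assumes the two thresholds work as follows: a document whose best cosine similarity reaches `t_high` joins that cluster and updates its centroid; one that falls between `t_low` and `t_high` joins the cluster without moving the centroid; anything below `t_low` starts a new cluster. The names `fuse`, `single_pass`, `t_high`, and `t_low` are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse(feature_vec, topic_vec, semantic_vec):
    """Assumed fusion rule: concatenate the three L2-normalized views."""
    return (l2_normalize(feature_vec)
            + l2_normalize(topic_vec)
            + l2_normalize(semantic_vec))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def single_pass(vectors, t_high=0.8, t_low=0.5):
    """Single-pass clustering with two thresholds (hypothetical criterion).

    Returns one cluster label per input vector, in input order.
    """
    centroids, counts, labels = [], [], []
    for vec in vectors:
        # Find the most similar existing cluster centroid.
        best, best_sim = -1, -1.0
        for idx, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = idx, sim

        if best >= 0 and best_sim >= t_high:
            # Confident match: join and update the running-mean centroid.
            n = counts[best]
            centroids[best] = [(c * n + x) / (n + 1)
                               for c, x in zip(centroids[best], vec)]
            counts[best] = n + 1
            labels.append(best)
        elif best >= 0 and best_sim >= t_low:
            # Borderline match: join, but leave the centroid untouched so
            # an ambiguous document cannot drag the cluster off course.
            labels.append(best)
        else:
            # No sufficiently similar cluster: start a new one.
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels
```

Freezing the centroid for borderline documents is one way to reduce the order sensitivity mentioned in the abstract, since only confident matches can shift a cluster's center.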
Keywords/Search Tags: text vectorization, topic model, event heat, event evolution context