
The Design And Implementation Of The New Event Detection System Based On LDA

Posted on: 2019-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Zhang
Full Text: PDF
GTID: 2428330551960301
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, news portals, social platforms, and other electronic media have become the main way for people to get the latest information. Online media information is usually organized into topics, and the discovery of a new topic often starts with the first report on a certain event. Identifying that first report is called First Story Detection, also known as New Event Detection. New Event Detection is an important research direction in natural language processing and text data mining, with great application value in the news media industry, Internet search engines, and recommendation systems. It remains a challenging task, however: the goal is to find, among the documents to be examined, those that report a new event.

This paper adopts a new event detection method based on novelty calculation. The method uses a two-phase framework. In the first phase, the text to be tested is classified into its topic using LDA. In the second phase, called novelty calculation, texts are represented as vectors over common feature words and named-entity words, and the similarity between the text to be tested and the historical documents belonging to its topic is calculated. If the similarity is lower than a threshold, the text is judged to report a new event, and it is tagged and added to the historical document records.

The main purpose of this paper is to design and implement a new event detection system. The system implements four modules: document information management, new event detection, parameter setting, and process monitoring. The new event detection module detects new events in documents automatically; after configuration, users can check documents periodically and quantitatively. Each detection run returns a prompt message and saves the detection result. The document information management module allows users to insert, delete, and view documents with detailed information, including document names, contents, and so on. The parameter setting module allows the administrator to modify and save the parameters used by the system, while ordinary users can only view them. The process monitoring module automatically updates data and saves text information within the system.
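The second phase described above (novelty calculation against the history of one topic) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the entity weight of 2.0, the threshold of 0.3, and the use of the maximum cosine similarity over the topic's history are all assumptions made for illustration. The LDA step is not reproduced here; the topic id is assumed to have been assigned already by the first phase.

```python
import math
from collections import Counter


def vectorize(tokens, entities, entity_weight=2.0):
    """Build a weighted term-frequency vector.

    Named-entity words count entity_weight times, other words once.
    The weighting scheme is an assumption, not taken from the thesis.
    """
    vec = Counter()
    for tok in tokens:
        vec[tok] += entity_weight if tok in entities else 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def detect_new_event(doc_vec, topic_id, history, threshold=0.3):
    """Return (is_new, best_similarity) for a document in a given topic.

    history maps a topic id to the vectors of documents already recorded
    for that topic. Taking the maximum similarity is an assumption; the
    abstract only says that similarity is compared with a threshold.
    """
    past = history.setdefault(topic_id, [])
    best = max((cosine(doc_vec, old) for old in past), default=0.0)
    is_new = best < threshold
    if is_new:
        # As in the abstract, a new event is added to the history records.
        past.append(doc_vec)
    return is_new, best


# Hypothetical usage: topic id 0 stands in for the output of the LDA phase.
history = {}
first = vectorize(["earthquake", "hits", "tokyo"], {"tokyo"})
follow_up = vectorize(["tokyo", "earthquake", "damage"], {"tokyo"})
print(detect_new_event(first, 0, history))
print(detect_new_event(follow_up, 0, history))
```

In this sketch the first report is flagged as new because the topic has no history yet, while the follow-up shares most of its weighted terms with it and falls above the threshold. In practice the threshold would be one of the values exposed through the parameter setting module.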
Keywords/Search Tags: New Event Detection, Topic Detection, Novelty Calculation, Natural Language Processing