
The Research Of Topic Tracking Based On Event Network

Posted on: 2014-06-08  Degree: Master  Type: Thesis
Country: China  Candidate: D Wang  Full Text: PDF
GTID: 2268330422953992  Subject: Computer application technology
Abstract/Summary:
Topic detection and tracking (TDT) is a technology for recognizing, mining, and organizing the information in news reports. With it, a computer can select and filter information on the Internet, making it more efficient for people to obtain useful information. Text representation is the foundation of every TDT task. Traditional text representation methods are mainly based on word frequency statistics and have two major drawbacks: 1. They lack semantic information, which is the bottleneck in improving the determination results. 2. In online topic tracking, the focal point of a topic changes over time, so the topic representation should change as well, but traditional methods have difficulty building an adaptive topic model.

An analysis of news reports shows that events are the main thread running through both reports and topics. The event is therefore taken as the basic unit of text representation, relations between events describe how a story develops, and an event network is built to represent a single report or a single topic. Compared with traditional text representation methods, this has two advantages. On one hand, an event is a well-defined semantic unit that includes event arguments such as location, time, subject, and object, so an event network carries rich semantic information. On the other hand, relations between events in different texts can be established through their shared event arguments, which makes it convenient to update the topic model by adding or removing events in the event network.

In this paper, based on an analysis of the drawbacks of current TDT methods, an event network method for text representation is proposed and adopted in a topic tracking system. To obtain complete event information, an event argument reasoning method based on an event ontology is proposed, and an event ontology for emergencies is built.
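As a rough illustration of linking events through shared arguments, the following minimal Python sketch builds a small event network. The `Event` fields mirror the argument types named above (location, time, subject, object); the class name, the `build_event_network` helper, the edge weight (number of shared arguments), and the sample events are all assumptions made for illustration and are not taken from the thesis.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; empty strings mean 'argument unknown'."""
    event_id: str
    trigger: str
    location: str = ""
    time: str = ""
    subject: str = ""
    obj: str = ""

    def arguments(self):
        # Return the non-empty (role, value) pairs of this event.
        pairs = (
            ("location", self.location),
            ("time", self.time),
            ("subject", self.subject),
            ("object", self.obj),
        )
        return {(role, value) for role, value in pairs if value}


def build_event_network(events):
    """Link two events when they share at least one (role, value) argument.

    Assumption: the edge weight is simply the count of shared arguments.
    """
    nodes = {e.event_id: e for e in events}
    edges = {}
    for a, b in combinations(events, 2):
        shared = a.arguments() & b.arguments()
        if shared:
            edges[frozenset((a.event_id, b.event_id))] = len(shared)
    return nodes, edges


# Made-up sample events, for illustration only.
events = [
    Event("e1", "earthquake", location="Sichuan", time="2013-04-20"),
    Event("e2", "rescue", location="Sichuan", subject="rescue team"),
    Event("e3", "donate", location="Beijing", subject="Red Cross"),
]
nodes, edges = build_event_network(events)
```

Here `e1` and `e2` are connected because they share a location, while `e3` stays isolated. Adding or removing an event only touches its own node and incident edges, which is the property that makes such a representation easy to update.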
A sub-topic partition method is put forward to set up a three-layer topic model: event, sub-topic, topic. Similarity between a topic and reports is calculated at the sub-topic layer in order to avoid the disparity in descriptive granularity between them. The main contributions of this paper are as follows:

1. An event-based ontology structure and a related emergency ontology are built, which form the basis for formalizing event classes and event relations. Using the event ontology, an event argument reasoning method is proposed to obtain complete event information; it comprises event class division, event argument reasoning rules, and the reasoning algorithm.

2. A text representation method based on the event network is proposed. A topic model is divided into three layers: event, sub-topic, topic. A sub-topic partition method is proposed based on a community discovery algorithm in the event network: it transforms the event network into a tree structure, removes some less important relations to obtain event communities, and uses an objective function to select the most reasonable sub-topics.

3. A topic tracking method based on the event network is proposed. Similarity is calculated at the sub-topic level rather than directly between the topic and the report, as traditional methods do. To avoid topic drift, the event network representing the topic model can be adaptively updated by adding and removing event nodes so that it describes more useful event information.

In this work, the event network serves as a text representation that carries more semantic information. The three-layer topic model makes similarity calculation more reasonable by taking into account the different descriptive power of topics and reports, and allows a unified threshold to be set for different topics. The event network topic model is convenient to update by adding or removing nodes.
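The idea of comparing a report with a topic at the sub-topic level, and then updating the topic model, can be sketched as follows. This is only a toy illustration under stated assumptions: sub-topics and reports are reduced to sets of event features, similarity is plain Jaccard overlap, the best-matching sub-topic decides the result, and the threshold value `0.3` is arbitrary. The abstract does not specify the similarity measure, the threshold, or the update rule, and the function names and sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def track_report(sub_topics, report_features, threshold=0.3):
    """Compare a report with each sub-topic instead of the whole topic.

    sub_topics: dict mapping sub-topic name -> set of event features.
    report_features: set of event features extracted from one report.
    Returns (on_topic, best_sub_topic, best_score). When the report is
    on-topic, its features are merged into the matching sub-topic, a
    simplistic stand-in for adaptively adding event nodes.
    """
    best_name, best_score = None, 0.0
    for name, features in sub_topics.items():
        score = jaccard(features, report_features)
        if score > best_score:
            best_name, best_score = name, score
    on_topic = best_name is not None and best_score >= threshold
    if on_topic:
        sub_topics[best_name] |= report_features
    return on_topic, best_name, best_score


# Made-up topic model with two sub-topics, for illustration only.
topic = {
    "rescue": {"Sichuan", "rescue team", "earthquake"},
    "donation": {"Red Cross", "donation"},
}
related = track_report(topic, {"Sichuan", "rescue team", "hospital"})
unrelated = track_report(topic, {"football", "match"})
```

Matching against one sub-topic at a time keeps a short report from being penalized for not covering the whole topic, which is the granularity mismatch the three-layer model is meant to avoid.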
Keywords/Search Tags:event network, event ontology, sub-topic partition, topic tracking