Research On Information Extraction Of Emergency Event From Online News

Posted on: 2013-03-26 | Degree: Master | Type: Thesis
Country: China | Candidate: Y F Han | Full Text: PDF
GTID: 2248330395480581 | Subject: Signal and Information Processing
Abstract/Summary:
When an emergency event occurs, the volume of related online news grows exponentially in the era of network information explosion. Given the vast amounts of network data, obtaining accurate and valid emergency information is vital for network users and emergency decision-making bodies that need to locate information on the Internet and accurately grasp how the emergency is developing. Information extraction of emergency events from online news focuses on how to extract emergency information from online news using event extraction technology, and then builds an Event Representation Model to describe the topic of the emergency. This paper presents a thorough study of text classification, event extraction and topic description. The main contributions of the paper are as follows:

(1) The architecture of emergency categories is studied thoroughly, and a method of hierarchical text classification for emergency events based on domain features is proposed according to the hierarchical structure of emergency categories. First, a virtual category tree is organized according to the category hierarchy. Then a multi-class SVM classifier based on a binary tree is built in each layer of the virtual category tree; the binary tree is generated under the guidance of the sum of class distances. The concept of the domain feature is introduced, and an automatic extraction algorithm is given for text feature selection in this stage. Finally, the text is classified layer by layer in accordance with the category level. Experiments show that the feature selection method based on the domain feature automatic extraction algorithm is better than the common methods. The hierarchical text classification method for emergency events can effectively reduce the time complexity and the risk of misclassification, and improve the classification performance at the same time.

(2) The traditional method of event extraction is trigger-driven, but this method may cause an imbalance between positive and negative samples.
Furthermore, a data sparseness problem arises when the corpus is small. To address these problems, an approach to sentence-level event extraction based on ISODATA is designed. First, the feature set used to train the Maximum Entropy Model for identifying event mentions is built under the guidance of the event semantics pattern. Then, an event similarity calculation method based on multi-feature fusion is discussed, and event mentions are clustered using ISODATA, whose split and merge handling has been improved to make it applicable to clustering non-numeric samples. Finally, using the KNN classification algorithm, the type of each event category is identified to complete the event extraction. For events with weak associations between triggers and event types, and for events whose types are ambiguous, experiments show that this method improves event classification performance through inference and testing over events of the same type, thereby enhancing the overall performance of event extraction; the algorithm can also automatically discover undefined event types.

(3) A method of emergency topic description based on an event framework is proposed in order to describe emergency event topics more reasonably and acquire topic information more efficiently. First, drawing on Framing Theory, the definition of the Event Framework is given, and an Emergency Framework based on the emergency life cycle model is constructed. Then, event extraction and automatic summarization technology are combined within the Emergency Framework to extract and organize the emergency topic, and a multi-document summarization algorithm based on event extraction is discussed. Finally, the result of the emergency topic description is given as a summary.
Experiments show that this method allows users to obtain most of the emergency topic information under information compression, thereby improving the efficiency of information acquisition; it is thus an effective emergency topic description method.
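
As a rough illustration of the binary-tree generation step in contribution (1), the sketch below orders classes by their summed class distance. The abstract does not define class distance or the exact splitting rule, so the following are assumptions made only for illustration: class distance is taken as the Euclidean distance between class centroids, the class with the largest summed distance to the remaining classes is split off first, and the function name `separation_order` and the class labels are hypothetical.

```python
from math import dist


def centroid(vectors):
    """Mean vector of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def separation_order(samples_by_class):
    """Return the order in which classes are split off in a skewed binary tree.

    Assumption (not taken from the thesis): class distance is the Euclidean
    distance between class centroids, and at each tree node the class whose
    summed distance to the remaining classes is largest is separated first.
    """
    centroids = {label: centroid(vecs) for label, vecs in samples_by_class.items()}
    remaining = sorted(centroids)  # sorted so that ties break deterministically
    order = []
    while len(remaining) > 1:
        def summed_distance(label):
            return sum(
                dist(centroids[label], centroids[other])
                for other in remaining
                if other != label
            )

        farthest = max(remaining, key=summed_distance)
        order.append(farthest)
        remaining.remove(farthest)
    order.extend(remaining)  # the last class needs no further split
    return order


# Hypothetical two-dimensional feature vectors for three emergency categories.
example = {
    "flood": [[0.0, 0.0], [0.0, 1.0]],
    "fire": [[1.0, 0.0], [1.0, 1.0]],
    "quake": [[10.0, 10.0], [10.0, 11.0]],
}
order = separation_order(example)
```

In a full system along these lines, each position in the returned order would correspond to one binary SVM that separates that class from all classes still remaining, so the most distinct class is resolved nearest the root.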
Keywords/Search Tags: Emergency Event, Domain Feature, Hierarchical Text Classification, Maximum Entropy Model, ISODATA, Event Extraction, Event Frame, Multi-Document Summarization, Topic Description
Related items