
Research On Sentence Level Chinese Event Extraction

Posted on: 2012-08-12  Degree: Master  Type: Thesis
Country: China  Candidate: X Ding  Full Text: PDF
GTID: 2218330362450422  Subject: Computer Science and Technology
Abstract/Summary:
With the explosive growth of information on the Internet, information extraction technology is becoming more and more important. Event extraction is a crucial information extraction task. It aims to accurately extract events, together with their arguments such as time, place and people, from unstructured text and to store them in structured form. Event extraction can be widely applied in natural language processing research, for example in automatic summarization, automatic question answering, information retrieval, public opinion monitoring and topic detection, and it also helps users view information more easily.

Traditional event extraction takes predefined event types as input and then extracts events and their related arguments using machine learning or pattern matching. Few previous studies have tested the robustness of their approaches on corpora from a variety of sources. This paper summarizes the lessons and the shortcomings of traditional event extraction systems. Based on this summary, we present BUEES, a Bottom-Up Event Extraction System, which extracts events from the Web without predefined event types or large-scale hand-tagged corpora. To test how robust our approach is, we evaluate our methods on three different data sets: ACE 05, Finance News and Music News.

1. This paper is the first to propose an approach that builds the event type paradigm by trigger clustering. With this method we not only automatically discover the 33 target event types defined in the ACE 2005 corpus, but also achieve good performance on the Finance News and Music News corpora. The results show that the approach is robust and domain-adaptive.

2. We incorporate an external dictionary resource into the event extraction task to address the data sparseness problem in the ACE corpus. This paper proposes an algorithm for automatically expanding event triggers based on TongYiCi CiLin. The approach successfully integrates the external resource and its wealth of semantic background knowledge, and it achieves good performance on the ACE 2005 corpus.

3. We propose a pattern generalization approach to address the low recall of event argument recognition based on pattern matching. This paper proposes a BestMatch-based pattern generalization algorithm. Based on its defined costs, the Soft-Pattern Learner is able to find the best generalization of any two instance patterns. Experimental results on the ACE 2005 corpus show that the method can, to some extent, alleviate the low recall of pattern matching.

4. For event argument extraction, this paper presents a method that extracts the key word of an event argument based on dependency parsing. We also apply noun phrase parsing to recognize the noun phrase in which the event argument is located. This method combines the advantages of the two syntactic parsers.
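The trigger expansion idea in contribution 2 can be illustrated with a minimal sketch. It is not the thesis's algorithm, whose details the abstract does not give. It assumes only that a thesaurus such as TongYiCi CiLin groups near-synonyms under shared group codes, and it adds to each event type every word that shares a group with one of its seed triggers. The group codes, the English words, and the names `TOY_THESAURUS`, `build_word_index` and `expand_triggers` are all hypothetical.

```python
from collections import defaultdict

# Toy thesaurus: group code -> words in that group. These entries are
# hypothetical stand-ins for the synonym groups of a Cilin-style dictionary.
TOY_THESAURUS = {
    "G01": ["attack", "assault", "strike"],
    "G02": ["strike", "walkout"],
    "G03": ["marry", "wed"],
}


def build_word_index(thesaurus):
    """Invert the thesaurus: word -> set of group codes it belongs to."""
    index = defaultdict(set)
    for code, words in thesaurus.items():
        for word in words:
            index[word].add(code)
    return index


def expand_triggers(seed_triggers, thesaurus):
    """Return event type -> seed triggers plus their one-hop synonyms.

    Only groups that contain a seed are used, so a word reachable solely
    through an expanded word (two hops away) is not added.
    """
    index = build_word_index(thesaurus)
    expanded = {}
    for event_type, seeds in seed_triggers.items():
        words = set(seeds)
        for seed in seeds:
            for code in index.get(seed, ()):
                words.update(thesaurus[code])
        expanded[event_type] = words
    return expanded


seeds = {"Attack": {"attack"}, "Marry": {"marry"}}
expanded = expand_triggers(seeds, TOY_THESAURUS)
```

Limiting the expansion to one hop is a design choice in this sketch: an ambiguous word such as "strike" belongs to several groups, and following it further would pull unrelated words like "walkout" into the Attack trigger set.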
Keywords/Search Tags:Event Extraction, Event Type Discovery, Event Type Recognition, Event Argument Recognition, Bottom-Up Event Extraction System